Memory grows linearly with context length, while compute grows quadratically (but with a small constant; up to ~100k tokens, inference is still dominated by the non-quadratic factors).
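As a rough illustration of that scaling, here's a back-of-the-envelope sketch (the model dimensions are hypothetical, loosely 7B-class with full multi-head KV) comparing KV-cache memory, which grows linearly with sequence length, against attention FLOPs, which grow quadratically:

```python
# Hypothetical 7B-class dimensions; adjust for the model you actually care about.
n_layers   = 32      # transformer layers
n_kv_heads = 32      # KV heads (no grouped-query sharing assumed here)
head_dim   = 128     # per-head dimension
bytes_per  = 2       # fp16/bf16

def kv_cache_bytes(seq_len: int) -> int:
    # 2 tensors (K and V) per layer, each [seq_len, n_kv_heads, head_dim]
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * bytes_per

def attn_flops(seq_len: int) -> int:
    # QK^T plus attention-weighted V: ~4 * seq_len^2 * hidden per layer
    hidden = n_kv_heads * head_dim
    return 4 * n_layers * seq_len * seq_len * hidden

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} tokens: KV cache ~ {kv_cache_bytes(n) / 2**30:6.1f} GiB, "
          f"attention ~ {attn_flops(n) / 1e12:8.1f} TFLOPs")
```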
Also, reusing keys/values across different queries can compress the KV cache; that can be a 1000x or 10000x improvement in bandwidth if the model is trained for it.
Just to clarify: simple prefix KV cache doesn't require any special model training. It does require the inference framework to support it, but most do by now.
You can see dramatic improvements in latency and throughput if there is a large prefix shared across queries.
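To make the prefix-reuse point concrete, here's a minimal sketch roughly following the cache-reuse pattern supported by recent Hugging Face transformers versions (the model id is just a placeholder): the shared prefix is prefilled once, and each subsequent query only pays prefill cost for its own short suffix.

```python
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; any causal LM with KV-cache support works the same way.
model_id = "meta-llama/Llama-3.2-1B-Instruct"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# Long shared prefix, e.g. a system prompt plus few-shot examples.
prefix = "You are a helpful assistant. <long instructions / examples here> "
prefix_ids = tok(prefix, return_tensors="pt").input_ids

# Prefill the shared prefix once and keep its KV cache around.
with torch.no_grad():
    prefix_cache = model(prefix_ids, use_cache=True).past_key_values

# Each query reuses the cached prefix; only the suffix tokens get prefilled.
# (Assumes the prefix tokenizes to the same ids as a prefix of the full prompt.)
for question in ["What is a KV cache?", "Why does prefix caching cut latency?"]:
    full_ids = tok(prefix + question, return_tensors="pt").input_ids
    cache = copy.deepcopy(prefix_cache)  # keep the shared cache clean across queries
    out = model.generate(full_ids, past_key_values=cache, max_new_tokens=50)
    print(tok.decode(out[0, full_ids.shape[1]:], skip_special_tokens=True))
```

Serving frameworks typically do this prefix matching for you automatically, which is where the latency/throughput wins above come from.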