Fair point on latency, we (Azure AI Search) target both scenarios with different features. For instant search you can just do the usual hybrid + rerank combo, or if you want query rewriting to improve user queries, you can enable QR at a moderate latency hit. We evaluated this approach at length here: https://techcommunity.microsoft.com/blog/azure-ai-foundry-bl...
Of course, agentic retrieval is just better quality-wise for a broader set of scenarios, usual quality-latency trade-off.
We don't do SPLADE today. We've explored it and may get back to it at some point, but we ended up investing more on reranking to boost precision, we've found we have fewer challenges on the recall side.
Of course, agentic retrieval is just better quality-wise for a broader set of scenarios, usual quality-latency trade-off.
We don't do SPLADE today. We've explored it and may get back to it at some point, but we ended up investing more on reranking to boost precision, we've found we have fewer challenges on the recall side.