My biggest RAG learning is to use agentic RAG. (Sorry for buzzword dropping)

- Classic RAG: `User -> Search -> LLM -> User`

- Agentic RAG: `User <-> LLM <-> Search`

Essentially, instead of a fixed pipeline, you provide search as a tool to the LLM, which enables three things:

- The LLM can search multiple times

- The LLM can adjust the search query

- The LLM can use multiple tools

The combination of these three things solves a majority of classic RAG problems: the model improves user queries, maps abbreviations, and corrects bad results on its own, and you can also let it list directories and load files directly.
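
As a minimal sketch: the loop below gives the model a single search tool and keeps calling the model until it answers instead of requesting another search. It assumes an OpenAI-style tool-calling API; `search_documents` is a hypothetical stand-in for your retrieval backend.

```python
import json

from openai import OpenAI

client = OpenAI()

def search_documents(query: str) -> str:
    # Hypothetical stand-in: replace with your vector or keyword search.
    return "...retrieved snippets for: " + query

# Tool schema the model sees; it decides when (and how often) to call it.
tools = [{
    "type": "function",
    "function": {
        "name": "search_documents",
        "description": "Search the knowledge base. Call again with a refined query if results are poor.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "How do we rotate API keys?"}]

# Agentic loop: ends when the model produces an answer instead of a tool call.
while True:
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools
    )
    msg = response.choices[0].message
    if not msg.tool_calls:
        print(msg.content)
        break
    messages.append(msg)
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": search_documents(args["query"]),
        })
```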



I fully support this approach! When I first started experimenting (rather naively) with using tool-enabled LLMs to generate documents, such as reports or ADRs, from the extensive knowledge base in Confluence, I built a few tools that let the LLM search Confluence using CQL (Confluence Query Language) and store the retrieved pages in a dedicated folder. The LLM could then search within that folder with simple filesystem tools and pull entire files into its context as needed. The results were quite good as long as the context didn't become overloaded. However, when I later tried to switch to a 'Classic RAG' setup, the output quality dropped significantly, so I refrained from switching.
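
Roughly, the tools looked like this. A sketch only, assuming Confluence Cloud's REST search endpoint; the `CONFLUENCE_BASE_URL`, credentials, and function names are placeholders:

```python
import pathlib

import requests

CONFLUENCE_BASE_URL = "https://example.atlassian.net/wiki"  # placeholder
AUTH = ("user@example.com", "api-token")                    # placeholder
WORKDIR = pathlib.Path("retrieved_pages")  # folder the LLM can browse later

def search_confluence(cql: str) -> list[str]:
    """Tool 1: run a CQL query and dump matching pages into WORKDIR."""
    resp = requests.get(
        f"{CONFLUENCE_BASE_URL}/rest/api/content/search",
        params={"cql": cql, "expand": "body.storage", "limit": 25},
        auth=AUTH,
    )
    resp.raise_for_status()
    WORKDIR.mkdir(exist_ok=True)
    saved = []
    for page in resp.json()["results"]:
        safe_title = page["title"][:40].replace("/", "_")
        path = WORKDIR / f"{page['id']}_{safe_title}.html"
        path.write_text(page["body"]["storage"]["value"])
        saved.append(path.name)
    return saved

def list_retrieved_pages() -> list[str]:
    """Tool 2: let the LLM see which pages have been fetched so far."""
    return sorted(p.name for p in WORKDIR.iterdir())

def read_page(filename: str) -> str:
    """Tool 3: pull an entire retrieved page into the LLM's context."""
    return (WORKDIR / filename).read_text()
```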


Yes, but the assistant often doesn't search when it should, and it very rarely does multiple search rounds (with both GPT-5 and Claude Sonnet 4.5; weaker models are even worse at tool calling).


Cannot confirm this. Both sound like prompting issues.

- Whether the model understands when and when not to use tools depends on your use case - gpt-5 is VERY persistent and often searches more than 10 times in a single run, depending on the results.

We're using Pydantic AI, where the entire agent loop is handled by the framework. Highly recommend.
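
For reference, a minimal agent in that style might look like this. A sketch only; the model string, prompt, and `search_documents` body are placeholders, and `result.output` assumes a recent Pydantic AI version:

```python
from pydantic_ai import Agent

agent = Agent(
    "openai:gpt-4o",  # placeholder model string
    system_prompt="Answer using the search tool; search again if results are weak.",
)

@agent.tool_plain
def search_documents(query: str) -> str:
    """Search the knowledge base and return matching snippets."""
    return "...retrieved snippets for: " + query  # placeholder backend

# The framework drives the tool-calling loop until the model answers.
result = agent.run_sync("How do we rotate API keys?")
print(result.output)
```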



