My biggest RAG learning is to use agentic RAG. (Sorry for buzzword dropping)

- Classic RAG: `User -> Search -> LLM -> User`

- Agentic RAG: `User <-> LLM <-> Search`

Essentially, instead of a fixed pipeline, you provide search as a tool to the LLM, which enables three things:

- The LLM can search multiple times

- The LLM can adjust the search query

- The LLM can use multiple tools

The combination of these three things solves a majority of classic RAG problems: the model improves user queries, maps abbreviations, and corrects bad results on its own, and you can also let it list directories and load files directly.
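
As a minimal sketch: the loop below gives the model a single search tool and keeps calling the model until it answers instead of requesting another search. It assumes an OpenAI-style tool-calling API; `search_documents` is a hypothetical stand-in for your retrieval backend.

```python
import json

from openai import OpenAI

client = OpenAI()

def search_documents(query: str) -> str:
    # Hypothetical stand-in: replace with your vector or keyword search.
    return "...retrieved snippets for: " + query

# Tool schema the model sees; it decides when (and how often) to call it.
tools = [{
    "type": "function",
    "function": {
        "name": "search_documents",
        "description": "Search the knowledge base. Call again with a refined query if results are poor.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "How do we rotate API keys?"}]

# Agentic loop: ends when the model produces an answer instead of a tool call.
while True:
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools
    )
    msg = response.choices[0].message
    if not msg.tool_calls:
        print(msg.content)
        break
    messages.append(msg)
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": search_documents(args["query"]),
        })
```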



I fully support this approach! When I first started experimenting (rather naively) with using tool-enabled LLMs to generate documents, such as reports or ADRs, from the extensive knowledge base in Confluence, I built a few tools that let the LLM search Confluence using CQL (Confluence Query Language) and store the retrieved pages in a dedicated folder. The LLM could then search within that folder with simple filesystem tools and pull entire files into its context as needed. The results were quite good as long as the context didn't become overloaded. However, when I later tried to switch to a 'Classic RAG' setup, the output quality dropped significantly, so I refrained from switching.
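
Roughly, the tools looked like this. A sketch only, assuming Confluence Cloud's REST search endpoint; the `CONFLUENCE_BASE_URL`, credentials, and function names are placeholders:

```python
import pathlib

import requests

CONFLUENCE_BASE_URL = "https://example.atlassian.net/wiki"  # placeholder
AUTH = ("user@example.com", "api-token")                    # placeholder
WORKDIR = pathlib.Path("retrieved_pages")  # folder the LLM can browse later

def search_confluence(cql: str) -> list[str]:
    """Tool 1: run a CQL query and dump matching pages into WORKDIR."""
    resp = requests.get(
        f"{CONFLUENCE_BASE_URL}/rest/api/content/search",
        params={"cql": cql, "expand": "body.storage", "limit": 25},
        auth=AUTH,
    )
    resp.raise_for_status()
    WORKDIR.mkdir(exist_ok=True)
    saved = []
    for page in resp.json()["results"]:
        safe_title = page["title"][:40].replace("/", "_")
        path = WORKDIR / f"{page['id']}_{safe_title}.html"
        path.write_text(page["body"]["storage"]["value"])
        saved.append(path.name)
    return saved

def list_retrieved_pages() -> list[str]:
    """Tool 2: let the LLM see which pages have been fetched so far."""
    return sorted(p.name for p in WORKDIR.iterdir())

def read_page(filename: str) -> str:
    """Tool 3: pull an entire retrieved page into the LLM's context."""
    return (WORKDIR / filename).read_text()
```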


Yes, but the assistant often doesn't search when it should, and it very rarely does multiple search rounds (with both GPT-5 and Claude Sonnet 4.5; weaker models are even worse at tool calling).


Cannot confirm this. Both sound like prompting issues.

- Whether the model understands when and when not to use tools depends on your use case - gpt-5 is VERY persistent and often searches more than 10 times in a single run, depending on the results.

We're using Pydantic AI, where the entire agent loop is handled by the framework. Highly recommend.
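
For reference, a minimal agent in that style might look like this. A sketch only; the model string, prompt, and `search_documents` body are placeholders, and `result.output` assumes a recent Pydantic AI version:

```python
from pydantic_ai import Agent

agent = Agent(
    "openai:gpt-4o",  # placeholder model string
    system_prompt="Answer using the search tool; search again if results are weak.",
)

@agent.tool_plain
def search_documents(query: str) -> str:
    """Search the knowledge base and return matching snippets."""
    return "...retrieved snippets for: " + query  # placeholder backend

# The framework drives the tool-calling loop until the model answers.
result = agent.run_sync("How do we rotate API keys?")
print(result.output)
```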



