Better RAG Results with Reciprocal Rank Fusion and Hybrid Search

>>johnjw+(OP)
Thanks for sharing. I like the approach, and it makes a lot of sense for the problem space, especially using existing products rather than building/hosting your own.

I was however tripped up by this sentence close to the beginning:

> we encountered a significant challenge with RAG: relying solely on vector search (even using both dense and sparse vectors) doesn’t always deliver satisfactory results for certain queries.

Not to be overly pedantic, but that's a problem with vector similarity, not RAG as a concept.

The author is clearly aware of that, but in the past few months alone I have had numerous conversations with people essentially saying "RAG doesn't work because I use pgvector (or whatever) and it never finds what I'm looking for", not realizing that 1) vector search is not the only way to do RAG, and 2) there is often a fair difference between the document embeddings and the vectorized query, and once you understand why that is, you can figure out how to fix it.

https://medium.com/@cdg2718/why-your-rag-doesnt-work-9755726... basically says everything I often say to people with RAG/vector search problems but again, seems like the assembled team has it handled :)

>>edude0+39
Author here: you're for sure right -- it's not a problem with RAG the theoretical concept. In fact, I think RAG implementations should likely be specific to their use cases (e.g. our hybrid search approach works well for customer support, but I'm not sure if it would work as well in other contexts, say for legal bots).

I've seen the whole gamut of RAG implementations as well, and the implementation, specifically the prompting and the document search, has a lot to do with the end quality.
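
For readers unfamiliar with the fusion step named in the title, here is a minimal sketch of reciprocal rank fusion in Python. It is not the article's implementation: the function name, the document ids, and the constant k = 60 (a commonly used default) are assumptions for illustration. Each document gets 1 / (k + rank) from every result list it appears in, and the summed scores decide the final order.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    rankings: list of lists, each ordered best-first (hypothetical ids).
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a vector search and a keyword search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists (doc_a and doc_c here) rise above documents that appear in only one, which is the point of combining vector and keyword retrieval.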

>>johnjw+V9
re: legal, I saw a post on this idea where their RAG system was designed to return the actual text from the document rather than an LLM response or summary. The LLM played a role in turning the query into the search params, but the insight was that for certain kinds of documents, you want the actual source because of the existing, human-written summary or the detailed nuances therein.

>>verdve+zo
Sounds more like Generation Augmented Retrieval in that case.
