zlacker

Better RAG Results with Reciprocal Rank Fusion and Hybrid Search

submitted by johnjw+(OP) on 2024-05-30 15:17:39 | 249 points 57 comments

NOTE: showing posts with links only
1. edude0+39[view] [source] 2024-05-30 16:07:41
>>johnjw+(OP)
Thanks for sharing, I like the approach and it makes a lot of sense for the problem space. Especially using existing products vs building/hosting your own.

I was, however, tripped up by this sentence close to the beginning:

> we encountered a significant challenge with RAG: relying solely on vector search (even using both dense and sparse vectors) doesn’t always deliver satisfactory results for certain queries.

Not to be overly pedantic, but that's a problem with vector similarity, not RAG as a concept.

Although the author is clearly aware of that, I have had numerous conversations in the past few months alone with people essentially saying "RAG doesn't work because I use pg_vector (or whatever) and it never finds what I'm looking for", not realizing that 1) it's not the only way to do RAG, and 2) there is often a fair difference between the document embeddings and the vectorized query, and once you understand why that is, you can figure out how to fix it.

https://medium.com/@cdg2718/why-your-rag-doesnt-work-9755726... basically says everything I often say to people with RAG/vector search problems but again, seems like the assembled team has it handled :)

15. pamela+4P[view] [source] 2024-05-30 19:56:22
>>johnjw+(OP)
If you're looking for an example of RRF + Hybrid Search with PostgreSQL, I've put together a FastAPI app here that uses RAG with those options:

https://github.com/Azure-Samples/rag-postgres-openai-python/

Here's the RRF+Hybrid part: https://github.com/Azure-Samples/rag-postgres-openai-python/...

That's largely based off a sample from the pgvector repo, with a few tweaks.

Agreed that hybrid is the way to go; it's what the Azure AI Search team also recommends, based on their research:

https://techcommunity.microsoft.com/t5/ai-azure-ai-services-...
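
For anyone who wants the gist of RRF without reading the SQL, here is a minimal Python sketch. It is not the code from the linked repos: the function name `reciprocal_rank_fusion`, the sample document ids, and the constant k=60 (a commonly used default) are assumptions for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a vector query and a full-text query.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Because only ranks are used, the two retrievers never need comparable score scales, and the same function works for more than two input lists.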

17. pmc00+nS[view] [source] 2024-05-30 20:18:39
>>johnjw+(OP)
For another set of measurements that support RRF + Hybrid > vectors, we (Azure AI Search team) did a bunch of evaluations a few months ago: https://techcommunity.microsoft.com/t5/ai-azure-ai-services-...

We also included supporting data in that write-up showing you can improve significantly on top of Hybrid/RRF using a reranking stage (assuming you have a good reranker model), so we shipped one as an optional step as part of our search engine.
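
To make the "reranking stage on top of hybrid" idea concrete, here is a rough Python sketch of the shape of such a pipeline. It is not Azure AI Search's implementation: `rerank`, `toy_overlap_score`, and the sample candidates are hypothetical, and the term-overlap scorer is only a stand-in for a real reranker model.

```python
def toy_overlap_score(query, text):
    """Stand-in for a reranker model: fraction of query terms found in text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


def rerank(query, candidates, score_fn, top_n=3):
    """Re-score (doc_id, text) candidates against the query and keep top_n.

    `candidates` is assumed to be the fused output of a first-stage
    hybrid retrieval step.
    """
    scored = [(score_fn(query, text), doc_id) for doc_id, text in candidates]
    # Highest score first; ties broken by doc id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_n]]


# Hypothetical first-stage results.
candidates = [
    ("doc_a", "Vector search basics"),
    ("doc_b", "Hybrid search in Postgres with pgvector"),
    ("doc_c", "Postgres full-text search"),
]

top = rerank("postgres hybrid search", candidates, toy_overlap_score, top_n=2)
print(top)
```

In a real system `score_fn` would call a model that reads the query and the document text together, which is why this step is usually applied only to a small candidate set.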

20. gregnr+6W[view] [source] 2024-05-30 20:36:33
>>johnjw+(OP)
In case you just want a single Postgres function that does RRF (pgvector+fts): https://supabase.com/docs/guides/ai/hybrid-search

(disclaimer: supabase dev who went down the rabbit hole with hybrid search)

23. cheesy+D61[view] [source] 2024-05-30 21:40:20
>>johnjw+(OP)
RRF is alright, but I've had better results with relative score, or distribution-based scoring.

LlamaIndex has a module for exactly this:

https://docs.llamaindex.ai/en/stable/examples/retrievers/rel...
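
A rough sketch of what relative-score fusion can look like, assuming min-max normalization per retriever followed by a weighted sum. This is not LlamaIndex's implementation; the function names, equal weights, and sample scores are made up for illustration.

```python
def min_max_normalize(scores):
    """Scale one retriever's raw scores into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every hit as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def relative_score_fusion(result_sets, weights=None):
    """Fuse {doc_id: raw_score} dicts using normalized, weighted scores."""
    if weights is None:
        # Assumed default: every retriever counts equally.
        weights = [1.0 / len(result_sets)] * len(result_sets)
    fused = {}
    for scores, weight in zip(result_sets, weights):
        if not scores:
            continue  # a retriever that returned nothing contributes nothing
        for doc_id, norm in min_max_normalize(scores).items():
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * norm
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical raw scores: cosine similarity and a keyword relevance score.
vector_scores = {"doc_a": 0.9, "doc_b": 0.8, "doc_c": 0.5}
keyword_scores = {"doc_b": 12.0, "doc_d": 7.0, "doc_a": 2.0}

ranked = relative_score_fusion([vector_scores, keyword_scores])
print(ranked)
```

Unlike RRF, this keeps information about how far apart the raw scores are, at the cost of being sensitive to outliers in either retriever's score range.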

◧◩
24. pamela+j71[view] [source] [discussion] 2024-05-30 21:44:25
>>cpursl+kC
I commented above with a pgvector example that does it:

>>40527925

◧◩
25. pamela+q71[view] [source] [discussion] 2024-05-30 21:45:33
>>esafak+sF
Re 1) pgvector has an example in the repo that uses a model for re-ranking: https://github.com/pgvector/pgvector-python/blob/master/exam...

I'm not using that in my own experiments since I don't want to worry about the performance of running a model in production, but it seems worth a try.

◧◩◪
27. kiwico+o81[view] [source] [discussion] 2024-05-30 21:52:57
>>flexzu+wD
Here is our doc with RRF:

https://supabase.com/docs/guides/ai/hybrid-search

◧◩◪
29. esafak+3a1[view] [source] [discussion] 2024-05-30 22:03:52
>>pamela+q71
That's outside the database, though. This is closer to what I had in mind: https://postgresml.org/blog/how-to-improve-search-results-wi...

33. yingfe+cz1[view] [source] 2024-05-31 02:09:12
>>johnjw+(OP)
RRF is a simple and effective way to fuse rankings from multiple recall paths. In our open source RAG product RAGFlow (https://github.com/infiniflow/ragflow), we currently use Elasticsearch rather than a general-purpose vector database, because it provides hybrid search today. By default, an embedding-based reranker is not required; RRF alone is enough. Even when a reranker is used, keyword-based retrieval must still be combined with embedding-based retrieval, which is what RAGFlow's latest 0.7 release provides.

On the other hand, let me introduce another database we developed, Infinity (https://github.com/infiniflow/infinity), which also provides hybrid search. You can see the performance here (https://github.com/infiniflow/infinity/blob/main/docs/refere...); both vector search and full-text search perform much faster than other open source alternatives.

Starting with the next version (a few weeks away), Infinity will also provide more comprehensive hybrid search capabilities: the 3-way recall you mentioned (dense vector, sparse vector, keyword search) will be available within a single request.

◧◩◪◨⬒
37. verdve+WV1[view] [source] [discussion] 2024-05-31 06:57:33
>>gradys+u91
It wasn't this GAR post; I remember them calling out legal docs explicitly. I might have seen it on Twitter.

https://blog.luk.sh/rag-vs-gar

◧◩
38. testfo+002[view] [source] [discussion] 2024-05-31 07:46:02
>>yingfe+cz1
Elasticsearch is publishing a lot of interesting posts on this topic, although with a bit of marketing, for example: https://www.elastic.co/search-labs/blog/semantic-reranking-w...

◧◩
43. s-mack+Ti2[view] [source] [discussion] 2024-05-31 11:18:47
>>thefou+G71
I find these discussions funny.

For decades we had search engines based on the query terms (keywords). Then there were lots of discussions and some implementations to put a semantic search on top of it to improve the keyword search. A hybrid search. Google Search did exactly that already in 2015 [1].

Now we start from pure semantic search and put keyword search on top of it to improve the semantic search and call it hybrid search.

In both approaches, the overall search performance is exactly identical - to the last digit.

I am glad that, so far, no one has called this an innovation. But you could certainly write a lot of blog articles about it.

[1] https://searchengineland.com/semantic-search-entity-based-se...

◧◩◪
46. charli+zS2[view] [source] [discussion] 2024-05-31 14:40:41
>>matthe+mC2
We've been building some systems for clients recently, including Moody's, using Lucene-based engines for the R part. The G part tends to be OpenAI or some such service, but there's also appetite for internally hosted LLMs. The trick is good measurement, as I explained in this talk at State of Open Con: https://www.youtube.com/watch?v=Ghbd1RkNgpM

◧◩◪
55. cpursl+Axa[view] [source] [discussion] 2024-06-03 17:52:10
>>cpursl+bw3
First take at hybrid search with Postgres pg_vector based on this: https://gist.github.com/cpursley/dae0a0be442f27e6af79d6bfc2b...