Better RAG Results with Reciprocal Rank Fusion and Hybrid Search

>>johnjw+(OP)
I also found that pure RAG with vector search didn't work. I was building a bot that answers questions by searching Slack discussions.

At first, I downloaded entire channels, loaded them into a vector DB, and did RAG. The results sucked. Vector search doesn't handle exact terms very well, and in this domain, specific keywords and error messages are very searchable.

Instead, I take the user's query, ask an LLM (Claude / Bedrock) to extract keywords, search Slack with those keywords through the API, use an LLM to filter the results down to the relevant discussions, and then summarize those in a response.

This is slow, of course, so it's heavily multi-threaded. A typical response comes back within 30 seconds.
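
For anyone who wants to try the same shape, here is a minimal sketch of the pipeline described above. It is not the actual bot: `answer_question` and its four callables (`extract_keywords`, `search`, `is_relevant`, `summarize`) are hypothetical names, and the real LLM and Slack API calls are assumed to live behind them. It only shows the ordering of the steps and where a thread pool helps.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def answer_question(
    query: str,
    extract_keywords: Callable[[str], List[str]],  # hypothetical: LLM call
    search: Callable[[str], List[str]],            # hypothetical: Slack search call
    is_relevant: Callable[[str, str], bool],       # hypothetical: LLM relevance check
    summarize: Callable[[str, List[str]], str],    # hypothetical: LLM summary call
    max_workers: int = 8,
) -> str:
    """Keyword extraction -> parallel search -> parallel filtering -> summary."""
    keywords = extract_keywords(query)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Run one search per keyword concurrently; map() keeps keyword order.
        result_lists = list(pool.map(search, keywords))

        # Merge the hits, dropping duplicates while preserving first-seen order.
        seen = set()
        candidates: List[str] = []
        for results in result_lists:
            for discussion in results:
                if discussion not in seen:
                    seen.add(discussion)
                    candidates.append(discussion)

        # Check each candidate for relevance concurrently (the slow LLM step).
        flags = list(pool.map(lambda d: is_relevant(query, d), candidates))

    relevant = [d for d, keep in zip(candidates, flags) if keep]
    return summarize(query, relevant)
```

Threads fit here because every step is waiting on a network call, so the work is I/O-bound rather than CPU-bound. Passing the steps in as callables also makes it easy to swap in stubs and test the flow without hitting an LLM or Slack.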

>>thefou+G71
Zero-shot keyphrase extraction is a reasonably well-studied field. I don’t know what the current SOTA is, but the one that was pretty hot shit last time I needed one was kbir-inspec, which is on HuggingFace, and you can test it right on the model page.

Might be worth a shot if performance is a tricky spot in your setup.
