zlacker

If there’s any type of memory upgrade for a coding agent I would want, it’s the ability to integrate a RAG into the context.

The information being available is not the problem; the agent not realizing that it doesn’t have all the info is, though. If you put it behind an MCP server, it becomes a matter of ensuring the agent will invoke the MCP at the right moment, which is a whole challenge in itself.

Are there any coding agents out there that enable you to plug middleware in there? I’ve been thinking about MITM’ing Claude Code for this, but wouldn’t mind exploring alternative options.

replies(1): >>simonw+C

>>stingr+(OP)
What do you mean by a RAG here?

I've been having a ton of success just from letting them use their default grep-style search tools.

I have a folder called ~/dev/ with several hundred git projects checked out, and I'll tell Claude Code things like "search in ~/dev/ for relevant examples and documentation".

(I'd actually classify what I'm doing there as RAG already.)

replies(2): >>stingr+e2 >>qudat+W7

>>simonw+C
What I mean is basically looking at the last (few) messages in the context, translating that to a RAG query, query your embeddings database + BM25 lookup if desired, and if you find something relevant inject that right before the last message in the context.

It’s pretty common in a lot of agents, but I don’t see a way to do that with Claude Code.

replies(1): >>UmGuys+6u

>>simonw+C
I do the same thing for libraries I’m using in project. It’s a huge power up for code agents.

Like you mentioned, agents are insanely good at grep. So much so that I’ve been trying to figure out how to create an llmgrep tool because it’s so good at it. Like, I want to learn how to be that good at grep, hah.

>>stingr+e2
I'm not familiar with Claude's architecture, but I'd be surprised if it doesn't index your codebase for semantic search with the explore feature it has. How else would they find context? They already have a semantic search tool -- which is rag.

replies(1): >>simonw+II

>>UmGuys+6u
Claude Code doesn't do anything with semantic search or embeddings out of the box. They use a simple grep tool instead.

Neither does OpenAI's Codex CLI - you can confirm that by looking at the source code https://github.com/openai/codex

Cursor and Windsurf both use semantic search via embeddings.

You can get semantic search in Claude Code using this unofficial plugin: https://github.com/zilliztech/claude-context - it's built by and uses a managed vector database called Zilliz Cloud.

replies(1): >>UmGuys+KL

>>simonw+II
That's shocking to me. Although it does make sense from a UX perspective as indexing can take minutes depending on the setup.

replies(1): >>stingr+qy1

>>UmGuys+KL
It’s surprisingly fast to generate embeddings. I don’t think it’s a UX issue as much as it’s that Anthropic themselves don’t offer any embeddings API (they only have one internally, but publicly recommend Cohere).

They do use RAGs a lot for their desktop app, their projects implementation make a lot of use of it.