One thing I've noticed building with Claude Code is that it's pretty aggressive about reading .env files and config files when it has access to them. The proxy approach sidesteps that entirely since there's nothing sensitive to find in the first place.
Wonder if the Anthropic team has considered building something like this into the sandbox itself - a secrets store that the model can "use" but never "read".
How would that work? If the AI can use it, it can read it. E.g.:
    secret-store "foo" > file
    cat file
You'd have to be very specific about how the secret can be used in order for the AI to not be able to figure out what it is. You could, for example, provide an HTTP proxy in the sandbox that injects the secret as an HTTP header when the secret is for accessing a website, and tell the AI to use that proxy. But you'd also have to scope down which URLs the proxy can access with that secret; otherwise it could just visit a page like this to read back the headers that were sent: https://www.whatismybrowser.com/detect/what-http-headers-is-...
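As a rough sketch of what that header-injecting proxy might look like (not anything Anthropic ships): a mitmproxy addon that only attaches the credential for an allow-listed host. Names like inject_secret.py, API_TOKEN and api.example.com are made up for illustration.

    # inject_secret.py - mitmproxy addon sketch (hypothetical names)
    import os
    from mitmproxy import http

    ALLOWED_HOSTS = {"api.example.com"}  # only these hosts ever receive the secret
    SECRET = os.environ["API_TOKEN"]     # held by the proxy process, never visible in the sandbox

    class InjectSecret:
        def request(self, flow: http.HTTPFlow) -> None:
            # Attach the credential only for allow-listed hosts; requests to
            # anything else (e.g. a header-echo site) pass through without it.
            if flow.request.pretty_host in ALLOWED_HOSTS:
                flow.request.headers["Authorization"] = "Bearer " + SECRET

    addons = [InjectSecret()]

You'd run it with something like mitmdump -s inject_secret.py and point the sandbox's HTTP_PROXY/HTTPS_PROXY at it (for HTTPS the sandbox also has to trust mitmproxy's CA cert). Even then, the allowlist only controls where the raw credential goes; the model can still see and repeat whatever the allowed API returns.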
Basically, for every "use" of a secret, you'd have to write a dedicated application which performs that task in a secure manner. It's not just a case of adding a special secret store.
We spent the last 50 years of computer security getting to a point where we keep sensitive credentials out of the hands of humans. I guess now we have to take the next 50 years to learn the lesson that we should keep those same credentials out of the hands of LLMs as well?
I'll be sitting on the sideline eating popcorn in that case.