For example, I would love for an agent to do my grocery shopping for me, but then I have to give it access to my credit card.
It is the same issue with travel.
What other useful tasks can one offload to the agents without risk?
Control all input and output with proper security controls.
While not perfect, it at least gives you a fighting chance to block it when your AI decides to send some random stranger your SSN and a credit card.
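To be concrete about "fighting chance": here is a naive sketch of what that kind of egress check could look like (hypothetical helper names, and as the replies below point out, pattern matching alone is nowhere near enough on its own):

```python
import re

# Hypothetical egress check: scan anything the agent tries to send off-box
# for strings that look like an SSN or a card number before letting it out.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")

def looks_sensitive(payload: str) -> bool:
    return bool(SSN_RE.search(payload) or CARD_RE.search(payload))

def guarded_send(payload: str, send):
    # `send` is whatever actually performs the outbound request.
    if looks_sensitive(payload):
        raise PermissionError("blocked: outbound payload matches a PII pattern")
    return send(payload)
```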
Claude Code asks me over and over "can I run this shell command?", and like everyone else, after the 5th time I tell it to run everything and stop asking.
Maybe using a credit card can be gated since you probably don't make frequent purchases, but frequently-used API keys are a lost cause. Humans are lazy.
That's the hard part: how?
With the right prompt, the confined AI can behave as maliciously (and cleverly) as a human adversary--obfuscating/concealing sensitive data it manipulates and so on--so how would you implement security controls there?
It's definitely possible, but it's also definitely not trivial. "I want to de-risk traffic to/from a system that is potentially an adversary" is ... most of infosec--the entire field--I think. In other words, it's a huge problem whose solutions require lots of judgement calls, expertise, and layered solutions, not something simple like "just slap a firewall on it and look for regex strings matching credit card numbers and you're all set".
The problem, simply put, is as difficult as:
Given a human running your system, how do you prevent them from damaging it? AI is effectively the same problem.
Outsourcing has a lot of interesting solutions around this. That industry already focuses heavily on giving a not-entirely-trusted agent access to secure systems. The solutions aren't perfect, but it's a good place to learn from.
You trust the configuration level, not the execution level.
API keys are honestly an easy fix. Claude Code already has built-in proxy support. I run containers where Claude Code gets a dummy key, and all requests are proxied out with the real key swapped in off-system.
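Roughly, the swap looks something like this sketch (hypothetical upstream, key values, and auth header; not my actual wiring):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import urllib.request

# Sketch of the key-swapping proxy idea (hypothetical upstream/key values).
# The container only ever sees DUMMY_KEY; the real key lives on the proxy host.
DUMMY_KEY = "sk-dummy-0000"
REAL_KEY = "sk-live-REDACTED"          # loaded on the host, never in the container
UPSTREAM = "https://api.example.com"   # placeholder API endpoint

class KeySwapProxy(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # Copy headers, dropping ones that shouldn't be forwarded verbatim.
        skip = {"host", "content-length", "connection"}
        headers = {k: v for k, v in self.headers.items() if k.lower() not in skip}
        # The actual swap: replace the dummy credential with the real one.
        if headers.get("Authorization") == f"Bearer {DUMMY_KEY}":
            headers["Authorization"] = f"Bearer {REAL_KEY}"
        req = urllib.request.Request(UPSTREAM + self.path, data=body,
                                     headers=headers, method="POST")
        with urllib.request.urlopen(req) as resp:
            data = resp.read()
            self.send_response(resp.status)
            self.send_header("Content-Type",
                             resp.headers.get("Content-Type", "application/json"))
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), KeySwapProxy).serve_forever()
```

The point is the real credential never exists inside the container, so even a fully compromised agent can't exfiltrate it; the worst it can do is make requests through the proxy, where you can also log and rate-limit.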