zlacker

[parent] [thread] 0 comments
1. dgunay+(OP)[view] [source] 2026-01-28 10:48:27
I ran an experiment at work where I was able to adversarially prompt inject a Yolo mode code review agent into approving a pr just by editing the project's AGENTS.md in the pr. A contrived example (obviously the solution is to not give a bot approval power) but people are running Yolo agents connected to the internet with a lot of authority. It's very difficult to know exactly what the model will consider malicious or not.
[go to top]