The execution sandbox stops the agent from breaking out during development, but the real risk is what gets shipped downstream. Seeing more tools now that scan the generated code itself, not just contain the execution environment.
See, my typical execution environment is a Linux vm or laptop, with a wide variety of SSH and AWS keys configured and ready to be stolen (even if they are temporary, it's enough to infiltrate prod, or do some sneaky lateral movement attack). On the other hand, typical application execution environment is an IAM user/role with strictly scoped permissions.