Also, trying out something new will most likely come with hiccups. Ultimately it may fail. But that doesn't mean it's not worth the effort.
The thing may evolve rapidly if it's being hard-tested on actual code and actual issues. For example, it will probably be changed to iterate until the tests actually run (and maybe some static checking can help it along, like preventing it from deleting tests).
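Roughly the kind of loop I mean, as a minimal sketch; `agent.propose_patch` and `agent.apply_patch` are hypothetical stand-ins for whatever the real tool exposes, not an actual API:

```python
import subprocess

MAX_ATTEMPTS = 5

def deletes_tests(patch: str) -> bool:
    # Crude static check: reject any diff that removes a test file outright.
    return any(line.startswith("deleted file") and "test" in line
               for line in patch.splitlines())

def run_tests() -> bool:
    # True only if the suite actually ran and passed, not merely "no crash".
    result = subprocess.run(["pytest", "--quiet"], capture_output=True, text=True)
    return result.returncode == 0

def iterate_until_green(agent) -> bool:
    feedback = ""
    for _ in range(MAX_ATTEMPTS):
        patch = agent.propose_patch(feedback)  # hypothetical agent call
        if deletes_tests(patch):
            feedback = "Patch rejected: it deletes tests."
            continue
        agent.apply_patch(patch)               # hypothetical agent call
        if run_tests():
            return True
        feedback = "Tests failed; revise the patch."
    return False
```

The point is just that the feedback signal is the test run itself, with a cheap static gate so the agent can't "pass" by removing the tests.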
Waiting to see what happens. I expect it will find its niche in development and become genuinely useful, taking menial tasks off developers' plates.
Reading AI-generated code is arguably far more annoying than any menial task. Especially if said code happens to have subtle errors.
Speaking from experience.
Reviewing what the AI does now is not comparable to reviewing human PRs. You are not doing the work as it will be expected in the (hopefully near?) future; you are training the AI and the developers of the AI, and more crucially: you are digging out failure modes to fix.
It would definitely be nice to be wrong though. That'd make life so much easier.