https://www.kaggle.com/competitions/konwinski-prize/
Currently, the #1 spot sits at a score of 0.09, not 0.9. That's a far cry from being useful. I know that open-source models are not as good as closed-source ones, but still, we're a long way from LLMs being good for code on their own.
And that supports OP's point: these tools aren't AGI, and they produce trash that needs evaluation, but they're still useful.