https://www.kaggle.com/competitions/konwinski-prize/
Currently, the #1 spot sits at a score of 0.09, not 0.9. That's a far cry from being useful. I know that open-source models are not as good as closed-source ones, but still, we're a long way from LLMs being good for code on their own.
And that supports OP's point - these tools aren't AGI, they produce trash that needs evaluation, but they're still useful.
The best IntelliSense and code completion tools would score 0.00. Those were the only tools we were using just a couple of years ago. 0.09 is a tremendous jump, and the improvements will accelerate!