If we have an intern junior dev on our team do we expect them to be 100% totally correct all the time? Why do we have a culture of peer code reviews at all if we assume that every one who commits code is 100% foolproof and correct 100% of the time?
Truth is we don't trust all the humans that write code to be perfect. As the old-as-the-hills saying goes "we all make mistakes". So replace "LLM" in your comment above with "junior dev" and everything you said still applies wether it is LLMs or inexperienced colleagues. With code, there is very rarely a single "correct" answer to how to implement something (unlike the calculator tautology you suggest) anyway, so an LLM or an intern (or even an experienced colleague) absolutely nailing their PRs with zero review comments etc seems unusual to me.
So we go back to the original - and I admit quite philosophical - point: when will we be happy? We take on juniors because they do the low-level and boring work and we need to keep an eye on their output until they learn and grow and improve ... but we cannot do the same for a LLM?
What we have today was literally science fiction not so long ago (e.g. "Her" movie from 2013 is now a reality pretty much). Step back for a moment - the fact we are even having this discussion that "yeah it writes code but it needs to be checked" is just mind-blowing that it even writes code that is mostly-correct at all. Give things another couple of years and its going to be even better.