There are still significant limitations: no amount of prompting will get current models to approach abstraction and architecture the way a person does. But I'm finding that these Gemini models are finally able to replace web searches and Stack Overflow for a lot of my day-to-day programming.
But I wonder when we'll be happy. Do we expect colleagues, friends, and family to be 100% laser-accurate 100% of the time? I'd wager we don't. Should we expect that from an artificial intelligence?
And it's not just about the percentage; it's also about the type of error. Will we reach a point where our perception shifts and we accept these as expected, non-human errors?
Or could we have a dedicated LLM that only checks for these types of errors?
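For what it's worth, that "checker LLM" idea could be as simple as a second pass over the first model's output. Here's a minimal sketch in Python; `generate` and `verify` are hypothetical stand-ins for real model API calls, not any particular library:

```python
# Sketch of a generate-then-verify loop: one model drafts an answer,
# a second pass checks it for a known class of errors and, if issues
# are flagged, the draft is regenerated with that feedback.
from typing import Callable


def answer_with_check(
    prompt: str,
    generate: Callable[[str], str],          # hypothetical drafting model
    verify: Callable[[str, str], list[str]],  # hypothetical checker model
    max_retries: int = 2,
) -> str:
    """Draft an answer, then ask a checker to flag suspect claims.

    `verify` returns a list of flagged issues; an empty list means
    the draft passed. On failure, retry with the issues fed back in.
    """
    draft = generate(prompt)
    for _ in range(max_retries):
        issues = verify(prompt, draft)
        if not issues:
            return draft
        # Feed the checker's complaints back into the next draft.
        draft = generate(f"{prompt}\n\nFix these issues:\n" + "\n".join(issues))
    return draft  # best effort after exhausting retries


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs; swap in real model calls.
    fake_generate = lambda p: "The answer is 42."
    fake_verify = lambda p, d: []  # pretend the checker found nothing
    print(answer_with_check("What is 6 * 7?", fake_generate, fake_verify))
```

The appeal of splitting the roles is that the checker can be tuned (or prompted) narrowly for the specific non-human error types we learn to expect, rather than being good at everything.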