There are still significant limitations, no amount of prompting will get current models to approach abstraction and architecture the way a person does. But I'm finding that these Gemini models are finally able to replace searches and stackoverflow for a lot of my day-to-day programming.
Usually I’m using a minimum of 200k tokens to start with gemini 2.5.
That generally seems right to me, given how much we hold in our heads when you’re discussing something with a coworker.