The Aider leaderboard measures the old Gemini 2.5 generating proper diffs 92% of the time. I bet this goes up to ~95-98%: https://aider.chat/docs/leaderboards/
Question for the Google peeps who monitor these threads: is gemini-2.5-pro-exp (free tier) updated as well, or will it go away?
Also, in the blog post, it says:
> The previous iteration (03-25) now points to the most recent version (05-06), so no action is required to use the improved model, and it continues to be available at the same price.
Does this mean gemini-2.5-pro-preview-03-25 now uses 05-06? Does the same apply to gemini-2.5-pro-exp-03-25?
Update: I just tried changing the date in the exp model name (gemini-2.5-pro-exp-05-06) and that doesn't work.
I don't have a formal benchmark, but there's a notable improvement in code generation due to this alone.
I've had Gemini chug away on plans that have taken ~1 hour to implement (~80M tokens spent). A good portion of that effort went into fixing failed search/replace edits made through Cline/Aider/Roo. If this model gets anywhere close to 100% on diffs then this is a BFD. I estimate this will translate to a 50-75% productivity boost on long-context coding tasks. I hope the initial results I'm seeing hold up!
I'm surprised by the reaction in the rest of the thread. A lot of unproductive complaining, a lot of off-topic stuff, nothing about the model itself.
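For anyone wondering why diff accuracy matters so much, here's a rough sketch of how a search/replace edit can fail. This is only an illustration under my own assumptions: `apply_search_replace` is a hypothetical helper, not how Aider, Cline, or Roo actually implement it, and real tools may do fuzzier matching.

```python
def apply_search_replace(source: str, search: str, replace: str) -> str:
    """Apply one search/replace edit (hypothetical helper, exact matching only).

    Raises ValueError unless the search text occurs exactly once in the source.
    """
    count = source.count(search)
    if count != 1:
        raise ValueError(f"expected exactly 1 match for search block, found {count}")
    return source.replace(search, replace, 1)


original = "def add(a, b):\n    return a + b\n"

# Well-formed edit: the search text matches the file character for character.
patched = apply_search_replace(original, "return a + b", "return a + b  # sum")

# Malformed edit: the model reproduced the line slightly wrong (missing spaces),
# so nothing matches and the edit has to be retried.
try:
    apply_search_replace(original, "return a+b", "return a - b")
    failed = False
except ValueError:
    failed = True
```

Every rejected edit like the second one costs another round trip with the full context, which is where the wasted tokens come from on long tasks.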
Any thoughts from anyone else using the updated model?
Does this 2.5 Pro "Preview" feel like an improvement if you've used the others?