Gemini 2.5 Pro Preview

>>meetpa+(OP)
Interestingly, when compering benchmarks of Experimental 03-25 [1] and Experimental 05-06 [2] it seems the new version scores slightly lower in everything except on LiveCodeBench.

[1] https://storage.googleapis.com/model-cards/documents/gemini-... [2] https://deepmind.google/technologies/gemini/

>>andy12+F8
This should be the top comment. Cherry-picking is hurting this industry.

I bet they kept training on coding tasks, made everything worse on the way, and tried to hide it under the rug because of the sunk costs.

zlacker