Gemini 2.5 Pro Preview

>>meetpa+(OP)
My frustration with using these models for programming in the past has largely been around their tendency to hallucinate APIs that simply don't exist. The Gemini 2.5 models, both Pro and Flash, seem significantly less susceptible to this than any other model I've tried.

There are still significant limitations: no amount of prompting will get current models to approach abstraction and architecture the way a person does. But I'm finding that these Gemini models are finally able to replace searches and Stack Overflow for a lot of my day-to-day programming.

>>segpha+J4
> no amount of prompting will get current models to approach abstraction and architecture the way a person does

I find this sentiment increasingly worrisome. It's entirely clear that every last human will be beaten on code design in the upcoming years (I am not going to argue whether it's 1 or 5 years away; who cares?).

I wish people would just stop holding on to what amounts to nothing, and think and talk more about what can be done in a new world. We need good ideas and I think this could be a place to advance them.

>>jstumm+jH
> It's entirely clear that every last human will be beaten on code design in the upcoming years

Citation needed. In fact, I think this pretty clearly hits the "extraordinary claims require extraordinary evidence" bar.

>>DanHul+2P
I would argue that what LLMs are capable of doing right now is already pretty extraordinary, and would fulfil your extraordinary evidence request. To turn it on its head - given the rather astonishing success of the recent LLM training approaches, what evidence do you have that these models are going to plateau short of your own abilities?

>>sweezy+k61
What they do is extraordinary, but that's not just a claim: they actually do it, and their doing so is the evidence.

Here someone just claimed that it is "entirely clear" LLMs will become super-human, without any evidence.

https://en.wikipedia.org/wiki/Extraordinary_claims_require_e...

>>sigmai+U61
Again - I'd argue that the extraordinary success of LLMs, in a relatively short amount of time, using a fairly unsophisticated training approach, is strong evidence that coding models are going to get a lot better than they are right now. Will they definitely surpass every human? I don't know, but I wouldn't say we're lacking extraordinary evidence for that claim either.

The way you've framed it seems like the only evidence you will accept is after it's actually happened.

>>sweezy+G71
Well, predicting the future is always hard. But if someone claims some extraordinary future event is going to happen, you at least ask for their reasons for claiming so, don't you?

In my mind, at this point we either need (a) some previously "hidden" super-massive source of training data, or (b) another architectural breakthrough. Without either, this is a game of optimization, and the scaling curves are going to plateau really fast.

>>sigmai+681
A couple of comments:

a) It hasn't even been a year since the last big breakthrough: the reasoning models like o3 only came out in September, and we don't know how far those will go yet. I'd wait a second before assuming the low-hanging fruit is done.

b) I think coding is a really good environment for agents / reinforcement learning. Rather than requiring a continual supply of new training data, we give the model coding tasks to execute (writing / maintaining / modifying) and then test its code for correctness. We could for example take the entire history of a code-base and just give the model its changing unit + integration tests to implement. My hunch (with no extraordinary evidence) is that this is how coding agents start to nail some of the higher-level abilities.
