Gemini 2.5 Pro Preview

>>meetpa+(OP)
My frustration with using these models for programming in the past has largely been around their tendency to hallucinate APIs that simply don't exist. The Gemini 2.5 models, both Pro and Flash, seem significantly less susceptible to this than any other model I've tried.

There are still significant limitations: no amount of prompting will get current models to approach abstraction and architecture the way a person does. But I'm finding that these Gemini models are finally able to replace searches and Stack Overflow for a lot of my day-to-day programming.

>>segpha+J4
> no amount of prompting will get current models to approach abstraction and architecture the way a person does

I find this sentiment increasingly worrisome. It's entirely clear that every last human will be beaten on code design in the upcoming years (I am not going to argue about whether it's 1 or 5 years away, who cares?).

I wish people would just stop holding on to what amounts to nothing, and think and talk more about what can be done in a new world. We need good ideas and I think this could be a place to advance them.

>>jstumm+jH
I code with multiple LLMs every day and build products that use LLM tech under the hood. I don't think we're anywhere near LLMs being good at code design. Existing models make _tons_ of basic mistakes and require supervision even for relatively simple coding tasks in popular languages, and it's worse for languages and frameworks that are less represented in public sources of training data. I am _frequently_ having to tell Claude/ChatGPT to clean up basic architectural and design defects. There's no way I would trust this unsupervised.

Can you point to _any_ evidence to support that human software development abilities will be eclipsed by LLMs other than trying to predict which part of the S-curve we're on?

>>ssalaz+gg1
I can't point to any evidence. Also I can't think of what direct evidence I could present that would be convincing, short of an actual demonstration? I would like to try to justify my intuition though:

Seems like the key question is: should we expect AI programming performance to scale well as more compute and specialised training is thrown at it? I don't see why not, it seems an almost ideal problem domain?

* Short and direct feedback loops

* Relatively easy to "ground" the LLM by running code

* Self-play / RL should be possible (it seems likely that you could also optimise for aesthetics of solutions based on common human preferences)

* Obvious economic value (based on the multi-billion dollar valuations of vscode forks)

All these things point to programming being "solved" much sooner than say, chemistry.
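To make the "grounding by running code" point concrete, here is a minimal sketch of the feedback loop I have in mind. Everything in it is an assumption for illustration: `run_candidate`, `grounded_loop`, and the `propose` callable (a stand-in for whatever model call you use) are hypothetical names, not any vendor's API, and a real setup would sandbox the execution rather than run it directly.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_candidate(source: str, test_code: str, timeout: float = 10.0) -> tuple[bool, str]:
    """Run candidate code plus its tests in a fresh interpreter.

    Returns (passed, feedback), where feedback is the captured stderr
    (for example a traceback) that could be shown back to the model.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(source + "\n\n" + test_code + "\n")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False, "timed out"
    return proc.returncode == 0, proc.stderr


def grounded_loop(propose, test_code: str, max_rounds: int = 3):
    """Ask `propose` (hypothetical model call) for code until the tests pass.

    `propose` receives the previous round's feedback and returns new source.
    Returns (source, round_number) on success, or (None, max_rounds) on failure.
    """
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        source = propose(feedback)
        passed, feedback = run_candidate(source, test_code)
        if passed:
            return source, round_no
    return None, max_rounds
```

The point is only that the pass/fail signal and the traceback come from actually executing the code, which is the short feedback loop from the first two bullets. Whether that signal is enough to improve design, and not just correctness, is exactly the open question.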

>>xyzzy1+Cw1
This is correct. No idea how people don't see this trend or consider it.
