Gemini 2.5 Pro Preview

>>meetpa+(OP)
My frustration with using these models for programming in the past has largely been around their tendency to hallucinate APIs that simply don't exist. The Gemini 2.5 models, both pro and flash, seem significantly less susceptible to this than any other model I've tried.

There are still significant limitations, no amount of prompting will get current models to approach abstraction and architecture the way a person does. But I'm finding that these Gemini models are finally able to replace searches and stackoverflow for a lot of my day-to-day programming.

>>segpha+J4
I have been asking if AI without hallucination, coding or not is possible but so far with no real concrete answer.

>>ksec+66
It's already much improved on the early days.

But I wonder when we'll be happy? Do we expect colleagues friends and family to be 100% laser-accurate 100% of the time? I'd wager we don't. Should we expect that from an artificial intelligence too?

>>mattlo+E6
I expect my calculator to be 100% accurate 100% of the time. I have slightly more tolerance for other software having defects, but not much more.

>>kweing+c8
AIs aren't intended to be used as calculators though?

You could say that when I use my spanner/wrench to tighten a nut it works 100% of the time, but as soon as I try to use a screwdriver it's terrible and full of problems and it can't even reliably so something as trivially easy as tighten a nut, even though a screwdriver works the same way by using torque to tighten a fastener.

Well that's because one tool is designed for one thing, and one is designed for another.

>>mattlo+4F
> AIs aren't intended to be used as calculators though?

Then why are we using them to write code, which should produce reliable outputs for a given input...much like a calculator.

Obviously we want the code to produce correct results for whatever input we give, and as it stands now, I can't trust LLM output without reviewing first. Still a helpful tool, but ultimately my desire would be to have them be as accurate as a calculator so they can be trusted enough to not need the review step.

Using an LLM and being OK with untrustworthy results, it'd be like clicking the terminal icon on my dock and sometimes it opens terminal, sometimes it might open a browser, or just silently fail because there's no reproducible output for any given input to an LLM. To me that's a problem, output should be reproducible, especially if it's writing code.

zlacker