Gemini 2.5 Pro Preview

>>meetpa+(OP)
My frustration with using these models for programming in the past has largely been around their tendency to hallucinate APIs that simply don't exist. The Gemini 2.5 models, both pro and flash, seem significantly less susceptible to this than any other model I've tried.

There are still significant limitations, no amount of prompting will get current models to approach abstraction and architecture the way a person does. But I'm finding that these Gemini models are finally able to replace searches and stackoverflow for a lot of my day-to-day programming.

>>segpha+J4
Making LLMs know what they don't know is a hard problem. Many attempts at making them refuse to answer what they don't know caused them to refuse to answer things they did in fact know.

>>redox9+na
> Many attempts at making them refuse to answer what they don't know caused them to refuse to answer things they did in fact know.

Are we sure they know these things as opposed to being able to consistently guess correctly? With LLMs I'm not sure we even have a clear definition of what it means for it to "know" something.

>>Volund+hk
Yes. You could ask for factual information like "Tallest building in X place" and first it would answer it did not know. After pressuring it, it would answer with the correct building and height.

But also things where guessing was desirable. For example with a riddle it would tell you it did not know or there wasn't enough information. After pressuring it to answer anyway it would correctly solve the riddle.

The official llama 2 finetune was pretty bad with this stuff.

zlacker