Cursor IDE support hallucinates lockout policy, causes user cancellations

>>scared+(OP)
There is a certain amount of irony that people try really hard to say that hallucinations are not a big problem anymore and then a company that would benefit from that narrative gets directly hurt by it.

Which of course they are going to try to brush it all away. Better than admitting that this problem very much still exists and isn’t going away anytime soon.

>>nerdjo+A84
It's a huge problem. I just can't get past it and I get burned by it every time I try one of these products. Cursor in particular was one of the worst; the very first time I allowed it to look at my codebase, it hallucinated a missing brace (my code parsed fine), "helpfully" inserted it, and then proceeded to break everything. How am I supposed to trust and work with such a tool? To me, it seems like the equivalent of lobbing a live hand grenade into your codebase.

Don't get me wrong, I use AI every day, but it's mostly as a localized code complete or to help me debug tricky issues. Meaning I've written and understand the code myself, and the AI is there to augment my abilities. AI works great if it's used as a deductive tool.

Where it runs into issues is when it's used inductively, to create things that aren't there. When it does this, I feel the hallucinations can be off the charts -- inventing APIs, function names, entire libraries, and even entire programming languages on occasion. The AI is more than happy to deliver any kind of information you want, no matter how wrong it is.

AI is not a tool, it's a tiny Kafkaesque bureaucracy inside of your codebase. Does it work today? Yes! Why does it work? Who can say! Will it work tomorrow? Fingers crossed!

>>Modern+ib4
> the very first time I allowed it to look at my codebase, it hallucinated a missing brace (my code parsed fine), "helpfully" inserted it, and then proceeded to break everything.

This is not an inherent flaw of LLMs, rather it is a flaw of a particular implementation-if you use guided sampling, so during sampling you only consider tokens allowed by the programming language grammar at that position, it becomes impossible for the LLM to generate ungrammatical output

> When it does this, I feel the hallucinations can be off the charts -- inventing APIs, function names, entire libraries,

They can use guided sampling for this too - if you know the set of function names which exist in the codebase and its dependencies, you can reject tokens that correspond to non-existent function names during sampling

Another approach, instead of or as well as guided sampling, is to use an agent with function calling - so the LLM can try compiling the modified code itself, and then attempt to recover from any errors which occur.

zlacker