Cursor IDE support hallucinates lockout policy, causes user cancellations

>>scared+(OP)
There is a certain amount of irony in people trying really hard to say that hallucinations are not a big problem anymore, and then a company that would benefit from that narrative getting directly hurt by one.

Of course, they are going to try to brush it all away. That's better than admitting that this problem very much still exists and isn’t going away anytime soon.

>>nerdjo+A84
It's a huge problem. I just can't get past it and I get burned by it every time I try one of these products. Cursor in particular was one of the worst; the very first time I allowed it to look at my codebase, it hallucinated a missing brace (my code parsed fine), "helpfully" inserted it, and then proceeded to break everything. How am I supposed to trust and work with such a tool? To me, it seems like the equivalent of lobbing a live hand grenade into your codebase.

Don't get me wrong, I use AI every day, but it's mostly as a localized code complete or to help me debug tricky issues. Meaning I've written and understand the code myself, and the AI is there to augment my abilities. AI works great if it's used as a deductive tool.

Where it runs into issues is when it's used inductively, to create things that aren't there. When it does this, I feel the hallucinations can be off the charts -- inventing APIs, function names, entire libraries, and even entire programming languages on occasion. The AI is more than happy to deliver any kind of information you want, no matter how wrong it is.

AI is not a tool, it's a tiny Kafkaesque bureaucracy inside of your codebase. Does it work today? Yes! Why does it work? Who can say! Will it work tomorrow? Fingers crossed!

>>Modern+ib4
You're not supposed to trust the tool, you're supposed to review and rework the code before submitting for external review.

I use AI for rather complex tasks. It's impressive. It can make a bunch of non-trivial changes to several files, and have the code compile without warnings. But I need to iterate a few times so that the code looks like what I want.

That being said, I also lose time pretty regularly. There's a learning curve, and the tool would be much more useful if it was faster. It takes a few minutes to make changes, and there may be several iterations.

>>yodsan+Si4
> You're not supposed to trust the tool

This is just an incredible statement. I can't think of another development tool we'd say this about. I'm not saying you're wrong, or that it's wrong to have tools we can't trust, just... wow... what a sea change.

>>schmic+Qq4
Imagine if your compiler just randomly and non-deterministically compiled valid code to incorrect binaries, and the tool's developer couldn't really tell you why it happens, how often it was expected to happen, how severe the problem was expected to be, and told you to just not trust your compiler to create correct machine code.

Imagine if your calculator app randomly and non-deterministically performed arithmetic incorrectly, and you similarly couldn't get correctness expectations from the developer.

Imagine if any of your communication tools randomly and non-deterministically translated your messages into gibberish...

I think we'd all throw away such tools, but we are expected to accept it if it's an "AI tool?"

>>ryandr+Yw4
If the only calculators that existed failed at 5% of the calculations, or if the only communication tools miscommunicated 5% of the time, we would still use both all the time. They would be far less than 95% as useful as perfect versions, but drastically better than not having the tools at all.

>>ToValu+XB4
Absolutely not. We'd just do the calculations by hand, which is better than running the 95%-correct calculator and then doing the calculations by hand anyway to verify its output.

>>gitrem+pE4
Suppose you work in a field where getting calculations right is critical. Your engineers make mistakes less than .01% of the time, but they do a lot of calculations and each mistake could cost $millions or lives. Double- and triple-checking help a lot, but they're costly. Here's a machine that verifies 95% of calculations, but you'd still have to do 5% of the work. Shall I throw it away?

Unreliable tools have a good deal of utility. That's an example of them helping reduce the problem space, but they can also be useful in situations where having a 95% confidence guess now matters more than a 99.99% confidence one in ten minutes: firing mortars in active combat, say.

There are situations where validation is easier than computation; canonically this is factoring, but even multiplication is much simpler than division. It could very easily save you time to multiply all of the calculator's outputs by the divisor, while performing both a multiplication and a division only for the 5% that are wrong.
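To make that concrete, here's a rough sketch, not a benchmark. It assumes a made-up `unreliable_divide` that returns a wrong quotient about 5% of the time (the function names, the error model, and the restriction to exact integer division are all my assumptions). Each answer is checked by multiplying the quotient by the divisor, and the division is redone only when the check fails.

```python
import random

def unreliable_divide(a, b, rng, error_rate=0.05):
    """Hypothetical flaky calculator: correct quotient, except ~error_rate of the time."""
    q = a // b
    if rng.random() < error_rate:
        return q + rng.randint(1, 9)  # simulated wrong answer, always differs from q
    return q

def checked_divide(a, b, rng):
    """Accept the flaky answer only if a single multiplication confirms it."""
    q = unreliable_divide(a, b, rng)
    if q * b == a:        # validation: quotient * divisor must equal the dividend
        return q, False
    return a // b, True   # check failed, so do the division ourselves

def run(n=10_000, seed=0):
    """Return the fraction of divisions that had to be redone."""
    rng = random.Random(seed)
    redone = 0
    for _ in range(n):
        b = rng.randint(2, 999)
        a = b * rng.randint(1, 999)  # exact multiples only, to keep the check simple
        q, fell_back = checked_divide(a, b, rng)
        assert q * b == a            # every accepted result is correct
        redone += fell_back
    return redone / n

if __name__ == "__main__":
    print(f"fraction of divisions redone: {run():.3f}")
```

Note that every answer still gets checked here; the claimed saving is only that the check (one multiplication) is cheaper than the computation it replaces (a division).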

edit: I submitted this comment, clicked to go to the front page, and right at the top is Unsure Calculator (no relevance). Sorry, I had to mention this.

>>ToValu+yR4
> you'd still have to do 5% of the work

No, you still have to do 100% of the work.
