zlacker

[parent] [thread] 7 comments
1. terhec+(OP)[view] [source] 2025-05-22 08:11:42
Can you give some examples where it didn't work for you? I'm curious because I derive a lot of value from it and my guess is that we're trying very different things with it.
replies(4): >>exe34+g2 >>wazoox+J7 >>Tade0+1H >>th0ma5+Pt1
2. exe34+g2[view] [source] 2025-05-22 08:32:50
>>terhec+(OP)
From my experience so far, most "AI skeptics" seem to be trying to catch the LLM in an error of reasoning or asking it to turn a vague description into a polished product in one shot. To make the latter worse, they often try to add context after the first wrong answer, which tends to make the LLM continue to be wrong - stop thinking about the pink elephant. No, I said don't think about the pink elephant! Why do you keep mentioning the pink elephant? I said I don't want a pink elephant in the text!
3. wazoox+J7[view] [source] 2025-05-22 09:27:18
>>terhec+(OP)
Not OP, but yesterday I was working on NFS server tuning on Linux, a topic it's typically quite difficult to find relevant info about through search engines. I asked Claude 3.5 to suggest some kernel settings or compile-time tweaks, and it provided me with entirely made-up answers: kernel variables that don't exist and makefile options that don't exist.

So maybe another LLM would have fared better, but still, so far it's mostly been wasted time. It works quite well for summarising texts and creating filler images, but outside of these two limited use cases I still don't find them reliable enough.
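
A minimal sketch of how a suggested knob can be sanity-checked before trusting it, assuming a Linux host: real sysctl tunables show up as files under /proc/sys/, so any name a model proposes can be verified directly. (sunrpc.tcp_slot_table_entries is a real RPC/NFS tunable used here only as an example; the second name is deliberately made up.)

  # Check whether a dotted sysctl name maps to a real file under /proc/sys/.
  from pathlib import Path

  def sysctl_exists(name: str) -> bool:
      """True if the dotted sysctl name exists as a file under /proc/sys/."""
      return Path("/proc/sys", *name.split(".")).is_file()

  for knob in ("sunrpc.tcp_slot_table_entries", "fs.nfs.made_up_setting"):
      print(f"{knob}: {'exists' if sysctl_exists(knob) else 'not a real tunable'}")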

replies(1): >>Yiin+J9
4. Yiin+J9[view] [source] [discussion] 2025-05-22 09:56:16
>>wazoox+J7
I mean, you answered your own question about why it didn't work: if there is no useful data in its training corpus, it would be a miracle if it could correctly guess the unknown information.
replies(2): >>rndmio+qc >>wazoox+mF
5. rndmio+qc[view] [source] [discussion] 2025-05-22 10:28:40
>>Yiin+J9
How are you supposed to know in advance whether it is going to be able to usefully answer your question or will just make something up?
6. wazoox+mF[view] [source] [discussion] 2025-05-22 14:30:08
>>Yiin+J9
The data is certainly available, both in the Linux kernel source and in LKML history. The answers looked perfect at first glance; anyone without prior knowledge of kernel compilation and patching would probably have been impressed by the technical detail. That's the typical LLM failure mode: it provides an answer where search engines fail you (they only surface the most basic, generic NFS-related forum posts, while I was looking for solid technical information for a high-performance environment), but that answer isn't actually any better (even after pointing out the error), and it would fool most people.
7. Tade0+1H[view] [source] 2025-05-22 14:38:55
>>terhec+(OP)
Not OP, but here's one instance over which I already had an internet fistfight with a person swearing by LLMs[0], meaning it should serve as a decent example:

> Suppose I'm standing on Earth and suddenly gravity stopped affecting me. What would be my trajectory? Specifically what would be my distance from Earth over time?

https://chatgpt.com/c/682edff8-c540-8010-acaa-8d9b5c26733d

It gives the "small distance approximation" in the examples, even if I ask for the solution after two hours, which at 879km is already quite off the correct ~820km.

An approximation that holds well on timescales from seconds to hours is pretty simple:

  s(t) = sqrt(R^2 + (Vt)^2) - R
And it's even plotted in the chart, but again the numbers are off.
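
For reference, plugging round values into the formula (assuming an equatorial rotation speed V ≈ 465 m/s and Earth radius R ≈ 6371 km, neither of which is stated above) reproduces the gap between the exact result and the small-distance approximation at the two-hour mark:

  # Compare the tangent-line distance with the small-distance approximation.
  import math

  R = 6_371_000.0   # Earth radius in metres (assumed round value)
  V = 465.0         # equatorial rotation speed in m/s (assumed round value)
  t = 2 * 3600      # two hours in seconds

  exact = math.sqrt(R**2 + (V * t)**2) - R   # s(t) = sqrt(R^2 + (Vt)^2) - R
  approx = (V * t)**2 / (2 * R)              # small-distance approximation

  print(f"exact:  {exact / 1000:.0f} km")    # ~826 km with these values
  print(f"approx: {approx / 1000:.0f} km")   # ~880 km, in line with the 879 km above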

[0] Their results gave wildly incorrect numbers at less than 100 seconds already, which was what originally prompted me to respond - they didn't even match the formula.

8. th0ma5+Pt1[view] [source] 2025-05-22 19:06:19
>>terhec+(OP)
Every single thing I'm interested in having my computer do has not been done before by anyone I know of, nor do I think anyone besides me would care.

I keep reading all of this glazing, like in the rest of the thread, and it's really frustrating: you get fatigued by all the BS coming out of these models, to the point that you don't want to use them at all. The more you try to get one to fix its output, the more it pulls in unrelated tokens.

In just the last 24 hours I've seen multiple models:

- Put C++ code structures in Python
- Synthesize non-existent functions, libraries, and features of programming languages
- Invent whole features of video file formats, and ffmpeg flags that aren't applicable to imagery
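
A constructed illustration of the first point (not taken from any actual model output): C++-flavoured Python tends to be valid but un-idiomatic, with semicolons, index-based loops, and hand-written accessors.

  # Runs fine, but reads like C++ translated line by line.
  class Point:
      def __init__(self, x, y):
          self.x = x; self.y = y   # legal, but the semicolon is a C++ habit

      def get_x(self):             # C++/Java-style getter; Python would just use point.x
          return self.x

  points = [Point(i, i * 2) for i in range(5)]

  # C-style index loop a model might emit:
  total = 0
  for i in range(0, len(points), 1):
      total = total + points[i].get_x()

  # Idiomatic equivalent:
  assert total == sum(p.x for p in points)
  print(total)   # 10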

I also think you're not going to get any good answers to this question, and a lot of pro-AI people are going to be left unsatisfied, because once you get into this spot, every single thing it does is wrong in some new way that cannot be easily categorized.

It is literally the limit of how information can be represented digitally.
