My AI skeptic friends are all nuts

>>tablet+(OP)
>If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing.

Here’s the thing from the skeptic perspective: This statement keeps getting made on a rolling basis. 6 months ago if I wasn’t using the life-changing, newest LLM at the time, I was also doing it wrong and being a luddite.

It creates a never ending treadmill of boy-who-cried-LLM. Why should I believe anything outlined in the article is transformative now when all the same vague claims about productivity increases were being made about the LLMs from 6 months ago which we now all agree are bad?

I don’t really know what would actually unseat this epistemic prior at this point for me.

In six months, I predict the author will again think the LLM products of 6 month ago (now) were actually not very useful and didn’t live up to the hype.

>>davidc+K8
name 5 tasks which you think current AIs can't do. then go and spend 30 minutes seeing how current AIs can do on them. write it on a sticky note and put it somewhere that you'll see it.

otherwise, yes, you'll continue to be irritated by AI hype, maybe up until the point where our civilization starts going off the rails

>>anxoo+0l
1. create a working (moderately complex) ghidra script without hallucinating.

Granted I was trying to do this 6 months ago, but maybe a miracle has happened. But I'm the past I had very bad experience with using LLMs for niche things (i.e. things that were never mentioned on stackoverflow)

>>poinca+qr
I've never heard of Ghidra before but, in case you're interested, I ran that prompt through OpenAI's o3 and Anthropic's Claude Opus 4 for you just now (both of them the latest/greatest models from those vendors and new as of less than six months ago) - results here: https://chatgpt.com/share/683e3e38-cfd0-8006-9e49-2aa799dac4... and https://claude.ai/share/7a076ca1-0dee-4b32-9c82-8a5fd3beb967

I have no way of evaluating these myself so they might just be garbage slop.

>>simonw+ds
The first one doesn't seem to actually give me the script, so I can't test it.

The second one didn't work for me without some code modification (specifically, the "count code blocks" didn't work), but the results were... not impressive.

It starts by ignoring every function that begins with "FUN_" on the basis that it's "# Skip compiler-generated functions (optional)". Sorry, but those functions aren't compiler-generated functions, they're functions that lack any symbol names, which in ghidra terms, is pretty damn common if you're reverse engineering unsymbolized code. If anything, it's the opposite of what you would want, because the named functions are the ones I've already looked at and thus give less of a guideline for interesting ones to look into next.

Looking at the results at a project I had open, it's supposed to be skipping external functions, but virtually all the top xrefs are external functions.

Finally, as a "moderately complex" script... it's not a good example. The only thing that approaches that complexity is trying to count basic blocks in a function--something that actually engages with the code model of Ghidra--but that part is broken, and I don't know Ghidra well enough to fix it. Something that would be more along the lines of "moderately complex" to me would be (to use a use case I actually have right now) for example turning the constant into a reference to that offset in the assumed data segment. Or finding all the switch statements that ghidra failed to decompile!

zlacker