zlacker

[parent] [thread] 7 comments
1. senord+(OP)[view] [source] 2026-01-01 19:23:30
I’ve been reading this comment multiple times a week for the last couple of years. Constant assertions that we’re starting to hit limits, plateau, etc. But a cursory glance at where we are today vs. a year ago, let alone two years ago, makes it wildly obvious that this is bullshit. The pace of improvement in both models and tooling has been breathtaking. I couldn’t give a shit whether you think it’s “exponential”; people like you were dismissing all of this years ago, and meanwhile I just keep getting more and more productive.
replies(1): >>qualif+Ok
2. qualif+Ok[view] [source] 2026-01-01 21:49:15
>>senord+(OP)
People keep saying stuff like this: that the improvements are so obvious and breathtaking and astronomical. Then I go check out the frontier LLMs again and they're maybe a tiny bit better than they were last year, but I can't actually be sure, because it's hard to tell.

Sometimes it seems like people are just living in another timeline.

replies(3): >>senord+mQ >>jennyh+r22 >>aspenm+993
3. senord+mQ[view] [source] [discussion] 2026-01-02 01:35:04
>>qualif+Ok
I’m genuinely curious what your “checking the frontier LLMs” looks like, especially if you haven’t used AI since last year.
4. jennyh+r22[view] [source] [discussion] 2026-01-02 14:20:32
>>qualif+Ok
"maybe a tiny bit better" is what you say when you've been tricked by snake oil salesman

This shit has gotten worse since 2023.

replies(1): >>aspenm+n93
5. aspenm+993[view] [source] [discussion] 2026-01-02 20:54:59
>>qualif+Ok
You might want to be more specific, because benchmarks abound and they paint a pretty consistent picture. LMArena "vibes" paint another picture. I don't know what you are doing to "check" the frontier LLMs, but whatever you're doing doesn't seem to match more careful measurement...

You don't actually have to take people's word for it: read epoch.ai developments, look into the benchmark literature, look at ARC-AGI...

replies(1): >>qualif+Djb
6. aspenm+n93[view] [source] [discussion] 2026-01-02 20:56:47
>>jennyh+r22
> This shit has gotten worse since 2023.

I would really appreciate it if people could be specific when they say stuff like this, because it's so crazy out of line with all measurement efforts. There are an insane number of serious problems with current LLM / agentic paradigms, but the idea that things have gotten worse since 2023? I mean, come on.

replies(1): >>senord+0P3
7. senord+0P3[view] [source] [discussion] 2026-01-03 01:34:18
>>aspenm+n93
You’re responding to a troll who just has a nasty, bitter axe to grind against AI. It’s honestly pretty sad and pathetic.
8. qualif+Djb[view] [source] [discussion] 2026-01-05 16:20:10
>>aspenm+993
That's half the problem, though. I can see benchmarks. I can see the number go up on some chart, or the AI scoring higher on some niche math or programming test, but those results don't actually seem to connect much to meaningful improvements in daily usage of the software when those updates hit the public.

That's where the skepticism comes in, because one side of the discussion is hyping up exponential growth and the other is seeing something that looks more logarithmic instead.

I realize anecdotes aren't as useful as numbers for this kind of analysis, but there's such a wide gap between what people are observing in practice and what the tests and metrics are showing that it's hard not to wonder about those numbers.
