zlacker

[parent] [thread] 2 comments
1. nayroc+(OP)[view] [source] 2026-02-03 01:28:59
The models they tested are already way behind the current state-of-the-art. Would be interesting to see if their results hold up when repeated with the latest frontier models.
replies(1): >>Stiles+Kr1
2. Stiles+Kr1[view] [source] 2026-02-03 13:20:43
>>nayroc+(OP)
I think we have all seen the latest models turn into a hot mess.
replies(1): >>louier+et2
3. louier+et2[view] [source] [discussion] 2026-02-03 18:01:07
>>Stiles+Kr1
I interpret Figure 2 as showing that incoherence increases with model generations, albeit on a small sample size.