zlacker

[parent] [thread] 4 comments
1. mounta+(OP)[view] [source] 2025-06-02 23:18:51
Almost every single major benchmark. And yes, progress is incremental, but it adds up; this has always been the case.
replies(1): >>attemp+TG
2. attemp+TG[view] [source] 2025-06-03 06:35:10
>>mounta+(OP)
We were talking about linear improvements, and I have yet to see any.
replies(1): >>mounta+Q12
3. mounta+Q12[view] [source] [discussion] 2025-06-03 17:04:48
>>attemp+TG
check the benchmarks or make one of your own
replies(1): >>attemp+qz2
4. attemp+qz2[view] [source] [discussion] 2025-06-03 20:20:04
>>mounta+Q12
I checked the BLEU score and perplexity of popular models, and both have stagnated since around 2021. As a disclaimer, this was a cursory check and I didn't dive into the details of how individual scores were evaluated.
replies(1): >>mounta+RJ4
5. mounta+RJ4[view] [source] [discussion] 2025-06-04 16:37:03
>>attemp+qz2
on what benchmarks? pretty much every major one shows linear improvement