zlacker
4 comments
1. mounta+(OP)
2025-06-02 23:18:51
Almost every single major benchmark. And yes, progress is incremental, but it adds up; this has always been the case.
replies(1):
>>attemp+TG
◧
2. attemp+TG
2025-06-03 06:35:10
>>mounta+(OP)
We were talking about linear improvements, and I have yet to see any.
replies(1):
>>mounta+Q12
◧◩
3. mounta+Q12
2025-06-03 17:04:48
>>attemp+TG
Check the benchmarks, or make one of your own.
replies(1):
>>attemp+qz2
◧◩◪
4. attemp+qz2
2025-06-03 20:20:04
>>mounta+Q12
I checked the BLEU scores and perplexity of popular models, and both have stagnated since around 2021. As a disclaimer, this was a cursory check, and I didn't dive into the details of how individual scores were evaluated.
replies(1):
>>mounta+RJ4
◧◩◪◨
5. mounta+RJ4
2025-06-04 16:37:03
>>attemp+qz2
On what benchmarks? Pretty much every major one shows linear improvement.