zlacker
[return to "Advancing AI Benchmarking with Game Arena"]
◧
1. 10xDev+td
2026-02-02 18:54:06
>>salkah+(OP)
If AI can program, why does it matter whether it can play chess using CoT when it can program a chess engine instead? This applies to other domains as well.
◧◩
2. simian+tj
2026-02-02 19:25:29
>>10xDev+td
It's the same reason we are asked to write exams without calculators even though the real world has them.
How you work without a calculator is a proxy for real-world competency.
◧◩◪
3. 10xDev+Gk
2026-02-02 19:31:35
>>simian+tj
Funny, you picked what is probably the most useless form of benchmarking applied to people as an example of measuring "competency" in the real world.