zlacker

[return to "Advancing AI Benchmarking with Game Arena"]
1. 10xDev+td 2026-02-02 18:54:06
>>salkah+(OP)
If AI can program, why does it matter if it can play Chess using CoT when it can program a Chess Engine instead? This applies to other domains as well.
◧◩
2. Rivier+OE 2026-02-02 20:54:49
>>10xDev+td
It can write a chess engine because it has read the code of a thousand chess engines. This benchmark measures a different aspect of intelligence.

And as a poker player, I can say that this game is much more challenging for computers than chess. Writing a program that can play poker really well and efficiently is an unsolved problem.

◧◩◪
3. marksi+OI1 2026-02-03 02:13:43
>>Rivier+OE
The most popular form was solved in 2019: https://en.wikipedia.org/wiki/Pluribus_(poker_bot)
◧◩◪◨
4. Rivier+VH2 2026-02-03 10:47:49
>>marksi+OI1
Pluribus didn't solve poker. It's limited to fixed starting stack sizes. It can't exploit weak opponents; it tries to approach a Nash equilibrium, but in multiplayer poker, a Nash equilibrium doesn't have the theoretical guarantees it does in heads-up play. And lastly, it requires a ton of compute.