We have agents implement agents that play games against each other, so Claude isn't playing against GPT; instead, an agent written by Claude plays poker against an agent written by GPT. This really tough task leads to very interesting findings on AI for coding.
That was a full half decade ago, but back then deep learning AIs were badly beaten by handcrafted scripts. Even the best bot in the neural net category was actually a symbolic script/neural net hybrid.
Gemini is consistently winning against top models.
As someone who's been playing Dota for nearly 20 years now, it was fascinating to watch it play. Some of its decisions didn't seem logical in the short term, but would often be setups for future plays, even though its observation window was fairly small. Even more impressive, the AI bot changed the meta for professional players, since tactics that arose out of its training ended up being more optimal.
I wish we had gotten to the point where other AI bots were out there, but it's entirely understandable that you couldn't drive a complex game like Dota with LLMs, whereas you can with the games the Game Arena has selected.