Advancing AI Benchmarking with Game Arena

>>salkah+(OP)
Wow. I'm generally in the AI maximalist camp. But adding Werewolf feels dangerous to me. Anyone who's played knows lying, deceit, and manipulation are often key to winning. We really want models climbing this benchmark?

>>bennyf+Nh
Oddly, in the highlighted game I watched, the werewolf simply gives up in the last round and says "I'm the werewolf, well done... Vote me."

Bizarre.

>>rustyh+sC
This is a legitimate strategy for the werewolf, no?

>>miniha+gf1
Probably not in this case.

There were two villagers and one werewolf. The werewolf started the round by saying "I'm the werewolf, vote for me," and then the game ended with a villager win.

Overnight he had successfully taken out the doctor. It made no sense, in my opinion.

There were some funny bits, like one of the Anthropic models forgetting a rule, which led to everyone accusing him of being a werewolf in a pile-on. He wasn't a werewolf; he genuinely forgot the rule. Happens in nearly every human game of Werewolf.
