>The object of the game for the third [human] player (B) is to help the interrogator. The best strategy for her is probably to give truthful answers. She can add such things as "I am the woman, don't listen to him!" to her answers, but it will avail nothing as the man can make similar remarks.
Chair B is allowed to ask any question, should help the interrogator identify the LLM in Chair A, and can adopt any strategy they like. So Chair B can just ask Chair A questions that will reveal it as a machine. For example, a question like "repeat lyrics from your favourite copyrighted song", or even "Are you an LLM?".
Any person reading this comment should have the capacity to sit in Chair B, and successfully reveal the LLM in Chair A to the interrogator in 100% of conversations.
What if you turned that 180, with models trained to deceive, lie, and try to pass the test?
Unless the LLM and its design are necessarily adversarial, and that's not even getting into red teaming or jailbreaks.
A human couldn't type for 24h straight, or faster than, say, X WPM. A human couldn't solve certain tricky problems, or know about and reply super fast to various news events, etc. Search/training date seems like an important factor to tie in too.
But yeah, overall, if the time is infinite you can come up with some new way to find out. It kinda becomes a cat-and-mouse game then, like software security nowadays.
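To make the typing-speed point concrete, here's a rough sketch of that kind of check. It assumes the common convention of 5 characters per "word" for WPM; the function names and the `max_wpm` parameter are made up for illustration, and the threshold is left for the caller to pick since I don't have a good number for X.

```python
def words_per_minute(chars_typed: int, elapsed_seconds: float) -> float:
    """Typing speed, assuming the common convention of 5 characters per word."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (chars_typed / 5) / (elapsed_seconds / 60)


def exceeds_typing_limit(reply: str, elapsed_seconds: float, max_wpm: float) -> bool:
    """True if the reply arrived faster than max_wpm would allow.

    max_wpm is the hypothetical 'X WPM' ceiling; no default is assumed here.
    """
    return words_per_minute(len(reply), elapsed_seconds) > max_wpm
```

A model that deliberately throttles its output would slip past this, of course, which is the cat-and-mouse part.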