A couple of months ago I saw a paper (can't remember if published or just on arxiv) in which Turing's original 3-player Imitation Game was played with a human interrogator trying to discern which of a human responder and an LLM was the human. When the LLM was a recent ChatGPT version, the human interrogator guessed
it to be the human
over 70% of the time; when the LLM was weaker (I think Llama 2), the human interrogator guessed it to be the human something like 54% of the time.
IOW, LLMs pass the Turing test.