ChatGPT (I've not got v4) deliberately fails the test by spewing out "as a large language model…", but also fails incidentally by having an attention span similar to my mother's shortly after her dementia diagnosis.
The problem with 3.5 is that it's simultaneously not mastered anything, and yet also beats everyone at whatever they've not mastered. An extremely drunk 50,000-year-old Sherlock Holmes who speaks every language and has read every book just isn't going to pass itself off as Max Mustermann in a blind hour-long trial.
On the one hand, what I was saying here was more about the Turing Test than about AGI. Sometimes ChatGPT gets called AGI, sometimes it's "autocomplete on steroids", but even if it is fancy autocomplete, I think 3.5 has the skill to pass a short Turing Test, just not the personality, and a full Turing Test needs a longer "short-term memory"-equivalent than 3.5 has.
On the other hand, as I (sadly) don't get paid to create LLMs, I've only got the kind of superficial awareness of how they work that comes from podcasts and the occasional blog post, which means ChatGPT might very well understand ChatGPT better than I do.
Can GPT-[3.5, 4] be prompted to make itself?