The mental gymnastics here are entertaining at best. Of course the thinking LLM would report that it's actually just a pattern model over text - but we shouldn't believe that? So by your own admission, the LLM was trained to lie about its true capabilities?
How about these...
What observable capability would you expect from "true cognitive thought" that a next-token predictor couldn’t fake?
Where are the system’s goals coming from—does it originate them, or only reflect the user/prompt?
How does it know when it's wrong without an external verifier? If the training data says X but the correct answer is Y - how will it ever know it was wrong and reach the correct conclusion?
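To be concrete about what I mean by "external verifier": anything outside the model that can check its output against ground truth. Here's a rough sketch in Python - the "model" is a hypothetical stub that parrots a memorized wrong answer, not any real API - showing that it's the outside check, not the model, that catches the error:

```python
# Minimal sketch of an external-verifier loop (hypothetical stub model, not a real API).
# The point: the model alone can't tell X from Y; only an outside check can.

def stub_model(question: str, attempt: int) -> str:
    """Stand-in for an LLM: regurgitates a memorized wrong answer first,
    then happens to produce the right one on a retry."""
    memorized = {"What is 17 * 24?": ["398", "408"]}  # wrong first, right second
    return memorized[question][min(attempt, 1)]

def external_verifier(question: str, answer: str) -> bool:
    """Ground truth the model has no access to: actually do the arithmetic."""
    expected = str(eval(question.removeprefix("What is ").removesuffix("?")))
    return answer == expected

question = "What is 17 * 24?"
for attempt in range(3):
    answer = stub_model(question, attempt)
    if external_verifier(question, answer):
        print(f"verified after {attempt + 1} attempt(s): {answer}")
        break
    print(f"attempt {attempt + 1} rejected: {answer}")
```

Without that verifier in the loop, the stub would happily keep returning the memorized wrong answer forever - which is exactly the question I'm asking about the real thing.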
You need to read a few papers published after 2023.