>>nyrikk+(OP)
So is the interpretation here something like “CoT tokens are actually neuralese”? They do boost performance, so the model must be stashing some intermediate reasoning in them. But perhaps it isn't using the literal human meaning of those tokens?