[return to "Beyond Semantics: Unreasonable Effectiveness of Reasonless Intermediate Tokens"]

>>nyrikk+(OP)
So is the interpretation here something like “CoT tokens are actually neuralese”? They do boost performance, so the model must be stashing some intermediate reasoning outputs there, but perhaps without using the literal human meaning of those tokens?

>>thepti+FA
Exactly, the traces lack semantics and shouldn't be anthropomorphized. (I'm one of the students in the lab that wrote this, but not one of the authors)

>>x_flyn+2e1
Thanks! So, how does this impact Deliberative Alignment[1], where IIUC the intermediate tokens are assessed (e.g., for referencing the appropriate policy fragment)?

Do you see your result as putting that paradigm in question, or does the explicit reasoning assessment perhaps ameliorate the issue?

[1]: https://arxiv.org/html/2412.16339v2
