[return to "Beyond Semantics: Unreasonable Effectiveness of Reasonless Intermediate Tokens"]
1. throwa+Id 2025-05-23 17:56:08
>>nyrikk+(OP)
Why is it unreasonable that giving the LLM a spot to think, collate long-range attention, and summarize, without the pressure of producing a meaningful next token so quickly, would result in higher effectiveness?
2. x_flyn+9e1 2025-05-24 04:42:34
>>throwa+Id
The "unreasonable" part is more about the lack of semantic meaning in the intermediate tokens, not about whether they are effective. They help even when the intermediates are wrong.