zlacker

1. woadwa+(OP) 2025-05-23 18:17:48
> Just because your sampler picks a sequence of tokens that contain incorrect reasoning doesn't mean a useful reasoning trace isn’t also contained within the latent space.

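That's essentially the core idea in Coconut [1][2]: keep the reasoning trace in a continuous latent space by feeding the model's last hidden state back in as the next input embedding, rather than sampling a token at each step.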
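To make the difference concrete, here's a rough toy sketch in plain Python. It is not Coconut's actual code: the "model" is a single random tanh layer, and every name in it (make_toy_model, step, nearest_token, discrete_reasoning, continuous_reasoning) is made up for illustration. The only point is to contrast collapsing to one token per step with passing the full hidden vector along.

```python
import math
import random


def make_toy_model(vocab_size=5, dim=4, seed=0):
    """Build a hypothetical toy model: an embedding table and one weight matrix."""
    rng = random.Random(seed)
    embed = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]
    weight = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
    return embed, weight


def step(weight, x):
    """One stand-in 'forward pass': hidden = tanh(W x)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weight]


def nearest_token(embed, h):
    """Greedy decode: pick the token whose embedding scores highest against h."""
    scores = [sum(e * hi for e, hi in zip(row, h)) for row in embed]
    return max(range(len(embed)), key=scores.__getitem__)


def discrete_reasoning(embed, weight, start_token, n_steps):
    """Token-space chain of thought: each step is collapsed to a single token."""
    x = embed[start_token]
    tokens = []
    for _ in range(n_steps):
        h = step(weight, x)
        tok = nearest_token(embed, h)
        tokens.append(tok)
        x = embed[tok]  # everything in h except the chosen token is dropped
    return tokens


def continuous_reasoning(embed, weight, start_token, n_steps):
    """Latent-space chain of thought: the hidden state is the next input."""
    x = embed[start_token]
    for _ in range(n_steps):
        x = step(weight, x)  # no sampling; the full vector is carried forward
    return nearest_token(embed, x), x  # decode only once, at the end


embed, weight = make_toy_model()
print(discrete_reasoning(embed, weight, start_token=0, n_steps=3))
print(continuous_reasoning(embed, weight, start_token=0, n_steps=3))
```

In the discrete loop, a bad pick by the sampler becomes the only thing the next step sees. In the continuous loop, nothing is picked until the end, so whatever the hidden state encodes survives from step to step.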

[1]: https://arxiv.org/abs/2412.06769

[2]: https://github.com/facebookresearch/coconut
