zlacker

[parent] [thread] 2 comments
1. naaski+(OP)[view] [source] 2026-02-04 16:19:31
I think any kind of innovation here will have to exploit some structure inherent to the problem, e.g. eliminating attention in favour of geometric structures like Grassmann flows [1].

[1] Attention Is Not What You Need, https://arxiv.org/abs/2512.19428
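
For anyone unfamiliar with the object: a point on the Grassmannian Gr(k, n) is a k-dimensional subspace of R^n, and the distance between two subspaces comes from their principal angles. A minimal numpy sketch of that geometry (my own toy, not the paper's construction):

    # A k-dim subspace of R^n is a point on Gr(k, n); represent it by an
    # orthonormal basis and measure distance via principal angles.
    import numpy as np

    def grassmann_distance(A, B):
        """Geodesic distance on Gr(k, n) between the column spans of A and B."""
        Qa, _ = np.linalg.qr(A)  # orthonormal basis for span(A)
        Qb, _ = np.linalg.qr(B)  # orthonormal basis for span(B)
        # Singular values of Qa^T Qb are the cosines of the principal angles.
        cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
        thetas = np.arccos(np.clip(cosines, -1.0, 1.0))
        return np.linalg.norm(thetas)

    rng = np.random.default_rng(0)
    A = rng.standard_normal((8, 3))  # two random 3-dim subspaces of R^8
    B = rng.standard_normal((8, 3))
    print(grassmann_distance(A, B))  # 0 iff the two subspaces coincide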

replies(1): >>findal+T8
2. findal+T8[view] [source] 2026-02-04 16:57:06
>>naaski+(OP)
Right - e.g., if you're modeling a physical system, it makes sense to bake some physics into the architecture - like symmetry.
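
A toy illustration of what baking in a symmetry buys you (my sketch, not from any particular paper): a permutation-invariant Deep Sets style model. Because pooling is a sum, shuffling the input set cannot change the output - the invariance holds by construction, not by training.

    import torch
    import torch.nn as nn

    class DeepSets(nn.Module):
        """Permutation-invariant set model: rho(sum_i phi(x_i))."""
        def __init__(self, d_in=4, d_hid=32, d_out=1):
            super().__init__()
            self.phi = nn.Sequential(nn.Linear(d_in, d_hid), nn.ReLU())
            self.rho = nn.Linear(d_hid, d_out)

        def forward(self, x):  # x: (batch, set_size, d_in)
            return self.rho(self.phi(x).sum(dim=1))  # sum-pool => invariant

    model = DeepSets()
    x = torch.randn(2, 10, 4)
    perm = torch.randperm(10)
    # Permuting the set elements provably leaves the output unchanged.
    assert torch.allclose(model(x), model(x[:, perm]), atol=1e-5)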
replies(1): >>naaski+eh
3. naaski+eh[view] [source] [discussion] 2026-02-04 17:34:38
>>findal+T8
Indeed, and I think natural language and reasoning have some geometric structure of their own. Attention is just a sledgehammer that lets us brute-force our way around not understanding that structure well. I think the next step change in AI/LLM abilities will come from exploiting this geometry somehow [1,2].

[1] GrokAlign: Geometric Characterisation and Acceleration of Grokking, https://arxiv.org/abs/2510.09782

[2] The Geometry of Reasoning: Flowing Logics in Representation Space, https://arxiv.org/abs/2506.12284
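
A well-worn illustration (not from [1] or [2]) that linguistic structure already shows up as geometry: analogy arithmetic in off-the-shelf GloVe vectors.

    # Requires gensim; pulls a ~65 MB pretrained model on first run.
    import gensim.downloader as api

    glove = api.load("glove-wiki-gigaword-50")
    # vec(king) - vec(man) + vec(woman) lands near vec(queen)
    print(glove.most_similar(positive=["king", "woman"], negative=["man"], topn=1))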

[go to top]