zlacker
[return to "Attention at Constant Cost per Token via Symmetry-Aware Taylor Approximation"]
1. andes3+ai
2026-02-04 15:56:21
>>fheins+(OP)
Linear-time attention doesn’t work, in principle. It’s a dead-end pursuit. There’s plenty of great research on making quadratic-time inference more efficient instead.
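For context, "linear-time attention" usually means schemes that never materialize the full n x n attention matrix. The sketch below contrasts standard softmax attention with a kernel feature-map variant in the style of Katharopoulos et al.; the feature map `phi` and the toy shapes are illustrative assumptions, not the symmetry-aware Taylor scheme from the linked paper.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: materializes an (n x n) score matrix,
    # so the cost is quadratic in sequence length n.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Kernelized "linear attention": softmax(QK^T)V is approximated by
    # phi(Q) (phi(K)^T V). By associativity the (n x n) matrix is never
    # formed, so the cost is O(n * d * d_v), i.e. linear in n.
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                   # (d, d_v) summary of keys and values
    Z = Qp @ Kp.sum(axis=0)         # per-query normalizer, shape (n,)
    return (Qp @ KV) / Z[:, None]

# Toy check on random data: both return an (n, d_v) output.
n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
```

Whether a feature-map (or Taylor) approximation of the softmax is expressive enough in practice is exactly the point under dispute here.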
2. smokel+mw
2026-02-04 16:57:14
>>andes3+ai
What about n log n?