zlacker

[parent] [thread] 2 comments
1. namari+(OP)[view] [source] 2025-05-07 11:26:08
On Limitations of the Transformer Architecture https://arxiv.org/abs/2402.08164

Theoretical limitations of multi-layer Transformer https://arxiv.org/abs/2412.02975

replies(1): >>sweezy+Z1
2. sweezy+Z1[view] [source] 2025-05-07 11:46:15
>>namari+(OP)
Only skimmed, but both seem to be referring to what transformers can do in a single forward pass; reasoning models would clearly be a way around that limitation.

o4 has no problem with the examples from the first paper (appendix A). You can see its reasoning here is also sound: https://chatgpt.com/share/681b468c-3e80-8002-bafe-279bbe9e18.... Not conclusive, unfortunately, since the paper falls within the date range of its training data. Reasoning models did kill off a large class of "easy logic errors" people discovered in the earlier generations, though.

replies(1): >>namari+t0c
3. namari+t0c[view] [source] [discussion] 2025-05-12 07:34:21
>>sweezy+Z1
Your unwillingness to engage with the limitations of the technology explains a lot of the current hype.