zlacker

[parent] [thread] 0 comments
1. sigmoi+(OP)[view] [source] 2024-02-14 07:33:03
I had forgotten it's existence by now, but I remember reading this post all those years back. Damn. I also remember thinking that this would be so cool if RNNs didn't suck at long contexts, even with an attention mechanism. In some sense, the only thing he needed was the transformer architecture and a "fuck, let's just do it" compute budget to end up at ChatGPT. He was always at the frontier of this field.
[go to top]