zlacker

1. jatins+(OP) 2024-02-14 07:20:19
I read that post recently and, as someone who has not been deeply involved in ML, it felt prescient.

Even the HN discussion around it had comments like "this feels like my baby learning to speak...", which are the same comparisons people were making when LLMs hit the mainstream in 2022.

replies(1): >>sigmoi+c1
2. sigmoi+c1 2024-02-14 07:33:03
>>jatins+(OP)
I had forgotten its existence by now, but I remember reading this post all those years back. Damn. I also remember thinking that this would be so cool if RNNs didn't suck at long contexts, even with an attention mechanism. In some sense, the only things he needed were the transformer architecture and a "fuck, let's just do it" compute budget to end up at ChatGPT. He was always at the frontier of this field.