zlacker

The architecture behind ChatGPT and the other AIs that are making the news will never improve to the point where it can correctly write non-trivial code. There is a fundamental reason for that.

Other architectures exist, but you can tell from how little people talk about them that they don't yet produce output nearly as developed as the ChatGPT kind. They will get there eventually, but that's not what we are seeing here.

replies(1): >>edanm+T31

>>marcos+(OP)
> The architecture behind ChatGPT and the other AIs that are making the news will never improve to the point where it can correctly write non-trivial code. There is a fundamental reason for that.

What is that?

replies(1): >>Curiou+Hw1

>>edanm+T31
Probably because it doesn't maintain long-term cohesion. Transformer models are great at producing things that look right over short distances, but as the output gets longer, it often becomes contradictory or nonsensical.

To get good output at larger scales, we're going to need a model that is hierarchical, with longer-term self-attention.
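
One way to picture the short-distance point is a toy sketch, on the assumption that the limit in question is a finite attention window (my reading, not something stated above). It is a single attention layer in plain Python with no learned weights; `windowed_attention`, `window`, and the sample vectors are all made up for illustration and are not taken from any real model.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def windowed_attention(tokens, window):
    """Causal self-attention where position i sees only the last
    `window` positions (itself included). Queries, keys and values
    are the raw token vectors: a deliberate simplification."""
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for i, query in enumerate(tokens):
        start = max(0, i - window + 1)
        visible = tokens[start:i + 1]
        weights = softmax([dot(query, key) * scale for key in visible])
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, visible)) for d in range(dim)]
        )
    return outputs

# Hypothetical 2-d "token" vectors.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
            [0.5, -0.5], [0.2, 0.9], [-1.0, 0.3]]
edited = [[9.0, -9.0]] + sequence[1:]  # change only the first token

base = windowed_attention(sequence, window=3)
changed = windowed_attention(edited, window=3)

# The last position sees only positions 3..5, so the early edit never reaches it.
print(base[-1] == changed[-1])
# Position 1 still has the first token in view, so its output does change.
print(base[1] == changed[1])
```

In this toy, anything that falls outside the window simply cannot influence later output. Real models stack many layers and use much larger contexts, so treat it as an illustration of the idea rather than a description of how any particular system behaves.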