zlacker

1. calf+(OP) 2023-11-18 06:10:31
Did Ilya give a reason why transformers are theoretically sufficient? I've watched him talk in a CS seminar and he's certainly interesting to listen to.
replies(1): >>kolja0+O7
2. kolja0+O7 2023-11-18 07:26:46
>>calf+(OP)
From the interviews with him that I have seen, Sutskever thinks that language modeling is a sufficient pretraining task because there is a great deal of reasoning involved in next-token prediction. The example he used was: suppose you feed a murder mystery novel to a language model and then prompt it with the phrase "The person who committed the murder was: ". The model would unquestionably need to reason in order to come to the right conclusion, but at the same time it is just predicting the next token.
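To make the mechanism concrete, here is a minimal sketch of what "answering by next-token prediction" looks like in code. It uses Hugging Face's transformers library with GPT-2 purely as a stand-in for a pretrained causal LM (Sutskever's point is about much more capable models, and the prompt here is just the tail end of his hypothetical novel):

    # Minimal sketch: next-token prediction as "answering" a question.
    # GPT-2 is only a stand-in for any pretrained causal language model.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Imagine the entire novel preceding this line; only the final prompt is shown.
    prompt = "The person who committed the murder was:"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

    # The model's "conclusion" is just the most probable next token.
    next_token_id = int(logits[0, -1].argmax())
    print(tokenizer.decode([next_token_id]))

Whatever reasoning the model does has to be compressed into that single distribution over the next token, which is the crux of the argument.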