zlacker

1. calf+(OP) 2023-11-18 06:10:31
Did Ilya give a reason why transformers are theoretically sufficient? I've watched him talk in a CS seminar and he's certainly interesting to listen to.
replies(1): >>kolja0+O7
2. kolja0+O7 2023-11-18 07:26:46
>>calf+(OP)
From the interviews with him that I have seen, Sutskever thinks that language modeling is a sufficient pretraining task because there is a great deal of reasoning involved in next-token prediction. The example he used was: suppose you feed a murder mystery novel to a language model and then prompt it with the phrase "The person who committed the murder was: ". The model would unquestionably need to reason in order to come to the right conclusion, but at the same time it is just predicting the next token.
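To make the mechanism concrete, here is a minimal sketch of what "answering by next-token prediction" looks like in code. It uses Hugging Face's transformers library with GPT-2 purely as a stand-in for a pretrained causal LM (Sutskever's point is about much more capable models, and the prompt here is just the tail end of his hypothetical novel):

    # Minimal sketch: next-token prediction as "answering" a question.
    # GPT-2 is only a stand-in for any pretrained causal language model.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Imagine the entire novel preceding this line; only the final prompt is shown.
    prompt = "The person who committed the murder was:"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

    # The model's "conclusion" is just the most probable next token.
    next_token_id = int(logits[0, -1].argmax())
    print(tokenizer.decode([next_token_id]))

Whatever reasoning the model does has to be compressed into that single distribution over the next token, which is the crux of the argument.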