I saw a comment (that I can’t find now) wondering if Sam might have been fired for copyright reasons. Pretty much all the big corpora used in LLM training contain copyrighted material, but that’s not a surprise and I really don’t think they’d kick him out over that. But what if he had a team of people deliberately adding a ton of copyrighted material (books, movies, etc.) to the training data for ChatGPT? It feels like it might fit the shape of the situation.
Speculation about these source materials goes back as far as 2020: https://twitter.com/theshawwn/status/1320282152689336320
I don't think this issue would've flown under the radar for so long, especially with the implication that Ilya sided with the rest of the board to vote against Sam and Greg.