So at this point it does not matter what you believe about LLMs: in general, to trust LeCun words is not a good idea. Add to this that LeCun is directing an AI lab that as the same point has the following huge issues:
1. Weakest ever LLM among the big labs with similar resources (and smaller resources: DeepSeek).
2. They say they are focusing on open source models, but the license is among the less open than the available open weight models.
3. LLMs and in general all the new AI wave puts CNNs, a field where LeCun worked (but that didn't started himself) a lot more in perspective, and now it's just a chapter in a book that is composed mostly of other techniques.
Btw, other researchers that were in the LeCun side, changed side recently, saying that now "is different" because of CoT, that is the symbolic reasoning they were blabling before. But CoT is stil regressive next token without any architectural change, so, no, they were wrong, too.
Are we able to prove it with output that's
1) algorithmically novel (not just a recombination)
2) coherent, and
3) not explainable by training data coverage.
No handwaving with scale...
I want LLMs to create, but so far, every creative output I’ve seen is just a clever remix of training data. The most advanced models still fail a simple test: Restrict the domain, for example, "invent a cookie recipe with no flour, sugar, or eggs" or "name a company without using real words". Suddenly, their creativity collapses into either, nonsense (violating constraints), or trivial recombination, ChocoNutBake instead of NutellaCookie.
If LLMs could actually create, we’d see emergent novelty, outputs that couldn’t exist in the training data. Instead, we get constrained interpolation.
Happy to be proven wrong. Would like to see examples where an LLM output is impossible to map back to its training data.