Are we able to prove it with output that is
1) algorithmically novel (not just a recombination),
2) coherent, and
3) not explainable by training-data coverage?
No handwaving about scale...
I want LLMs to create, but so far every creative output I’ve seen is just a clever remix of training data. The most advanced models still fail a simple test: restrict the domain, for example "invent a cookie recipe with no flour, sugar, or eggs" or "name a company without using real words". Suddenly their creativity collapses into either nonsense (violating the constraints) or trivial recombination: ChocoNutBake instead of NutellaCookie.
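For what it's worth, the constraint half of that test is easy to check mechanically; the hard half is judging novelty. A minimal sketch of the mechanical part, in Python. The word-list path, the banned-ingredient set, and the function names here are my own assumptions for illustration, not anything from the challenge itself:

```python
# Minimal harness for the two constraint checks described above.
# Assumptions: a Unix-style word list exists at /usr/share/dict/words,
# and the model's output is already available as a plain string.

BANNED = {"flour", "sugar", "egg", "eggs"}  # assumed ingredient list


def recipe_violations(recipe: str) -> set[str]:
    """Return any banned ingredients the recipe text mentions."""
    tokens = {w.strip(".,;:()\"'").lower() for w in recipe.split()}
    return BANNED & tokens


def load_dictionary(path: str = "/usr/share/dict/words") -> set[str]:
    """Load a word list to test 'no real words' company names against."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f}


def is_real_word(name: str, dictionary: set[str]) -> bool:
    """A proposed name fails the test if it's already a dictionary word."""
    return name.lower() in dictionary


if __name__ == "__main__":
    words = load_dictionary()
    print(recipe_violations("Mix almond flour with dates and cocoa."))
    # -> {'flour'}  (the check is literal: 'almond flour' still trips it)
    print(is_real_word("ChocoNutBake", words))  # -> False: made-up compound
    print(is_real_word("Amazon", words))        # -> True on most word lists
```

Of course this only catches constraint violations. Deciding whether an output that passes is genuinely novel, rather than a remix like ChocoNutBake, is exactly the part no script settles.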
If LLMs could actually create, we’d see emergent novelty: outputs that couldn’t exist in the training data. Instead, we get constrained interpolation.
Happy to be proven wrong. Would like to see examples where an LLM output is impossible to map back to its training data.
The demand for outputs that are provably untraceable to training data feels like asking for magic, not creativity. Even Gödel didn’t require “never-seen-before atoms” to demonstrate emergence.