
[return to "Chomsky on what ChatGPT is good for (2023)"]
1. titzer+S5 2025-05-25 17:53:56
>>mef+(OP)
All this interview proves is that Chomsky has fallen far, far behind how AI systems work today and is retreating to scoffing at all the progress machine learning has achieved. Machine learning has given rise to AI now. It can't explain itself from first principles or its architecture, but you couldn't explain your brain from first principles or its architecture either; you'd need all of neuroscience to do it. Because it is digital and (probably) does not reason like our brains do, it somehow falls short?

While there are some things here I find myself nodding along to, I can't help but feel it's a really old take that is super vague and hand-wavy. The truth is that all of the progress in machine learning absolutely is science. We understand extremely well how to make neural networks learn efficiently; it's why all that data leads anywhere at all. Backpropagation and gradient descent are extraordinarily powerful. Not to mention all the "just engineering" of making chips crunch incredible amounts of numbers.
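
(To be concrete about what "backpropagation and gradient descent" mean in practice, here's a toy sketch: a one-hidden-layer network fit to made-up data with hand-written gradients. Every number and name here is invented for illustration; real systems do the same thing at vastly larger scale with automatic differentiation.)

    # Toy gradient descent with manual backprop: fit y = sin(x) on fake data.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(256, 1))
    y = np.sin(X)

    W1, b1 = rng.normal(0, 0.5, (1, 32)), np.zeros(32)
    W2, b2 = rng.normal(0, 0.5, (32, 1)), np.zeros(1)
    lr = 0.05

    for step in range(2000):
        # forward pass
        h = np.tanh(X @ W1 + b1)
        pred = h @ W2 + b2
        loss = np.mean((pred - y) ** 2)

        # backward pass (the chain rule, i.e. backpropagation)
        d_pred = 2 * (pred - y) / len(X)
        dW2, db2 = h.T @ d_pred, d_pred.sum(axis=0)
        d_h = (d_pred @ W2.T) * (1 - h ** 2)
        dW1, db1 = X.T @ d_h, d_h.sum(axis=0)

        # gradient descent update
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    print(f"final mean squared error: {loss:.4f}")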

Chomsky is extremely ungenerous to the progress and also pretty flippant about what this stuff can do.

I think we should probably stop listening to Chomsky; he hasn't said anything here that he hasn't already said a thousand times over the decades.

2. cj+S8 2025-05-25 18:16:00
>>titzer+S5
> Not to mention all the "just engineering" of making chips crunch incredible amounts of numbers.

Are LLMs still the same black box they were described as a couple of years ago? Are their inner workings at least slightly better understood than in the past?

Running tens of thousands of chips crunching a bajillion numbers a second sounds fun, but that's not automatically "engineering". You can have the same chips crunching numbers with the same intensity just to search for a large prime number. Chips crunching numbers isn't automatically engineering IMO. More like a side effect of engineering? Or a tool you use to run the thing you built?

What happens when we build something that works, but we don't actually know how? We learn about it through trial and error, rather than from foundational principles of the technology.

Sorta reminds me of the human brain, psychology, and how some people think psychology isn't science. The brain is a black box, kind of like an LLM? Some people will think it's still science, others will have less respect.

This perspective might be off base. It rests on the assumption that we all agree LLMs are a poorly understood black box and no one really knows how they truly work. I could be completely wrong on that; would love for someone else to weigh in.

Separately, I don't know the author, but agreed, it reads more like a pop-sci book. Although I only hope to write as coherently as that when I'm 96 y/o.

3. ogogma+Hb 2025-05-25 18:37:37
>>cj+S8
> Running tens of thousands of chips crunching a bajillion numbers a second sounds fun, but that's not automatically "engineering".

Not if some properties are unexpectedly emergent. Then it is science. For instance, why should a generic statistical model be able to learn how to fill in blanks in text using a finite number of samples? And why should a generic blank-filler be able to produce a coherent chat bot that can even help you write code?
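
(If it helps to see the crudest possible version of "fill in blanks statistically", here's a bigram counter over a made-up two-sentence corpus. It is nothing like a transformer, but it's the same family of idea, taken to an absurd scale.)

    # Toy "blank filler": predict the next word purely from counted statistics.
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat . the dog sat on the rug .".split()

    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    def fill_blank(prev_word):
        # pick the statistically most likely next word given the previous one
        return counts[prev_word].most_common(1)[0][0]

    print(fill_blank("sat"))  # -> 'on'
    print(fill_blank("the"))  # -> one of 'cat'/'dog'/'mat'/'rug' (counts are tied)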

Some have even claimed that statistical modelling shouldn't be able to produce coherent speech, because it would need impossible amounts of data, or the optimisation problem might be too hard, or because Goedel's incompleteness theorem somehow implies that human-level intelligence is uncomputable, etc. The fact that we have a talking robot means that those people were wrong. That should count as a scientific breakthrough.

4. Shorel+8d 2025-05-25 18:48:36
>>ogogma+Hb
> because it would need impossible amounts of data

The training data for LLMs is so massive that it is effectively impossible for any person to consume it all in a lifetime. Or even a small percent of it.
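
(A quick back-of-envelope, with assumed round numbers rather than figures from any particular model: a corpus of ~10 trillion tokens, ~0.75 words per token, and a fast reader at 250 words per minute, nonstop.)

    tokens = 10e12            # assumed training-set size, in tokens
    words = tokens * 0.75     # rough tokens-to-words conversion
    minutes = words / 250     # fast, uninterrupted reading speed
    years = minutes / (60 * 24 * 365)
    print(f"{years:,.0f} years of round-the-clock reading")  # ~57,000 years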

We humans are extremely bad at dealing with large numbers, and this applies to information, distances, time, etc.

5. ogogma+ld 2025-05-25 18:50:06
>>Shorel+8d
Your final remark sounds condescending. Anyway, the number of coherent chat sessions you could have with an LLM astronomically exceeds the amount of data available to train it. How is that even possible?
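
(Rough combinatorics, with assumed round numbers: a ~50,000-token vocabulary and a modest 1,000-token conversation. Most sequences are gibberish, of course, but even a vanishingly small coherent fraction of that space dwarfs the data.)

    import math

    vocab, length = 50_000, 1_000
    digits = int(length * math.log10(vocab)) + 1
    print(f"distinct possible sequences: a {digits}-digit number")
    print("training tokens: on the order of 10^13")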