zlacker

Chomsky on what ChatGPT is good for (2023)

submitted by mef+(OP) on 2025-05-25 17:07:58 | 286 points 351 comments

NOTE: showing posts with links only
3. whatth+32[view] [source] 2025-05-25 17:25:47
>>mef+(OP)
A 3.35-hour Chomsky interview on ML Street Talk: https://youtu.be/axuGfh4UR9Q
◧◩
59. Smaug1+b8[view] [source] [discussion] 2025-05-25 18:10:32
>>newAcc+56
From some Googling and use of Claude (and from summaries of the suggestively titled "Impossible Languages" by Moro linked from https://en.wikipedia.org/wiki/Universal_grammar), it looks like he's referring to languages that violate the laws constraining the languages humans are innately capable of learning. But it's very unclear why "machine M is capable of learning more complex languages than humans" implies anything about the linguistic competence or the intelligence of machine M.
◧◩◪
84. rahimn+3f[view] [source] [discussion] 2025-05-25 19:01:21
>>foobar+ta
Is that true? This paper claims it is not.

https://arxiv.org/abs/2401.06416

◧◩◪◨
93. foobar+cj[view] [source] [discussion] 2025-05-25 19:31:54
>>rahimn+3f
Yes, it's true. You can read my response to one of the authors (@canjobear) describing the problem with that paper in the comment linked below. To summarize: to show what they want to show, they would have to take the simple, interesting languages based on linear order that Moro showed humans cannot learn and demonstrate that LLMs also can't learn them, and they don't do that.

The reason the Moro languages are of interest is that they are computationally simple, so it's a puzzle why humans can't learn them (and no surprise that LLMs can). The authors of the paper miss the point and show irrelevant things, such as the existence of complicated languages that neither humans nor LLMs can learn.

>>42290482
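
A toy sketch of what a "language based on linear order" looks like (my own illustration of the kind of rule Moro tested, not his actual stimuli): negation is inserted after a fixed word position, regardless of the sentence's structure. It is computationally trivial, which is exactly why humans' failure to acquire such rules is the interesting part.

  # Toy illustration (not Moro's actual materials): a rule stated over linear
  # word positions rather than phrase structure.
  def negate_linear(sentence: str, position: int = 3) -> str:
      words = sentence.split()
      # insert the negation marker after the Nth word, whatever that word is
      return " ".join(words[:position] + ["not"] + words[position:])

  print(negate_linear("the old man reads the paper"))
  # -> "the old man not reads the paper"
  print(negate_linear("yesterday the man read the paper"))
  # -> "yesterday the man not read the paper"  (the rule ignores structure)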

◧◩◪
132. hbarta+4U[view] [source] [discussion] 2025-05-26 00:22:17
>>rxtexi+jx
Chomsky has been colossally wrong on universal grammar.

https://www.scientificamerican.com/article/evidence-rebuts-c...

But at least he admits that:

https://dlc.hypotheses.org/1269#

◧◩
135. thomas+AU[view] [source] [discussion] 2025-05-26 00:26:28
>>lanfeu+f6
Chomsky's problem here has nothing to do with his politics, but unfortunately a lot to do with his long-held position in the Nature/Nurture debate - a position that is undermined by the ability of LLMs to learn language without hardcoded grammatical rules:

  Chomsky introduced his theory of language acquisition, according to which children have an inborn quality of being biologically encoded with a universal grammar
https://psychologywriting.com/skinner-and-chomsky-on-nature-...
◧◩
137. godels+YV[view] [source] [discussion] 2025-05-26 00:38:54
>>caliba+cd

  > The fact that we have figured out how to translate language into something a computer can "understand" should thrill linguists. 
I think they are really excited by this. There is no shortage of linguists using these machines.

But I think it is important to distinguish between the ability to understand language and the ability to translate it. Enough so that you yourself put quotes around "understanding". This is often a challenge for human translators, who may not know how to properly translate something because of the underlying context.

Our communication runs far deeper than the words we speak or write on a page. That depth is much of what linguistics is about. (Or at least that's what they've told me, since I'm not a linguist.) This seems to be the distinction Chomsky is trying to make.

  > The main debate now is over the semantics of words like "understanding" and whether or not an LLM is conscious in the same way as a human being (it isn't).
Exactly. Here, I'm on the side of Chomsky, and I don't think there's much of a debate to be had. We have a long history of making accurate predictions while misunderstanding the underlying causal structure.

My background is physics, and I moved into CS (degrees in both), working on ML. I see my peers at the top, like Hinton[0] and Sutskever[1], making absurd claims. I call them absurd because it is a mistake we've made over and over in the field of physics[2,3], one of those lessons you have to learn again and again because it is so easy to make. Hinton and Sutskever say that this is a feature, not a bug. Yet we know it is not enough to fit the data. Fitting the data lets you make accurate, testable predictions, but it is not enough to model the underlying causal structure.

Science has a long history of accurate predictions built on incorrect models, not just in the sense of the Relativity of Wrong[4], but more directly. Did we forget that the geocentric model could still be used to make good predictions? Copernicus did not just face resistance from religious authorities, but also from academics. The same is true for Galileo, Boltzmann, Einstein, and many more. People didn't reject their claims because they were being unreasonable; they rejected the claims because there were good reasons to. Just... not good enough reasons to make them right.
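
To make that concrete, here is a minimal sketch (my addition, not anything Hinton or Sutskever wrote): data generated by one law (exponential decay) can be fit extremely well by a causally wrong model (a polynomial). The fit gives accurate, testable predictions on the data it was fit to, while saying nothing about the mechanism, and it falls apart the moment you leave that data.

  import numpy as np

  # Toy sketch: accurate predictions from a causally wrong model.
  rng = np.random.default_rng(0)
  x = np.linspace(0.0, 2.0, 200)
  y = np.exp(-x) + rng.normal(0.0, 0.01, x.size)   # "observations" of exponential decay

  coeffs = np.polyfit(x, y, deg=5)                 # wrong model, excellent fit
  fit = np.polyval(coeffs, x)
  print("in-sample RMSE:", np.sqrt(np.mean((fit - y) ** 2)))   # tiny

  # Outside the fitted range the wrong model falls apart:
  print("true value at x=10:        ", np.exp(-10.0))
  print("polynomial prediction x=10:", np.polyval(coeffs, 10.0))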

[0] https://www.reddit.com/r/singularity/comments/1dhlvzh/geoffr...

[1] https://www.youtube.com/watch?v=Yf1o0TQzry8&t=449s

[2] https://www.youtube.com/watch?v=hV41QEKiMlM

[3] Think about what Fermi said in order to understand the relevance of this link: https://en.wikipedia.org/wiki/The_Unreasonable_Effectiveness...

[4] https://hermiene.net/essays-trans/relativity_of_wrong.html

◧◩◪◨
139. GolfPo+8X[view] [source] [discussion] 2025-05-26 00:52:27
>>0xbadc+GS
>Can LLMs actually parse human languages?

IMHO, no, they have nothing approaching understanding. It's Chinese Rooms[1] all the way down, just with lots of bells and whistles. Spicy autocomplete.

1. https://en.wikipedia.org/wiki/Chinese_room

◧◩◪◨
144. hackin+GY[view] [source] [discussion] 2025-05-26 01:09:14
>>0xbadc+GS
LLMs are modelling the world, not just "predicting the next token". They are certainly not akin to parrots. Some examples here[1][2][3]. Anyone claiming otherwise at this point is not arguing in good faith.

[1] https://arxiv.org/abs/2405.15943

[2] https://x.com/OwainEvans_UK/status/1894436637054214509

[3] https://www.anthropic.com/research/tracing-thoughts-language...

◧◩◪◨⬒⬓⬔
157. AIorNo+a11[view] [source] [discussion] 2025-05-26 01:33:29
>>xwolfi+IZ
LLMs are, to a large extent, neuronal analogs of human neural architecture

- of course they reason

The claim of the “stochastic parrot” needs to go away

Eg see: https://www.anthropic.com/news/golden-gate-claude

I think the rub is that people think you need consciousness to do reasoning; I'm NOT claiming LLMs have consciousness or awareness

◧◩◪◨
167. downbo+X31[view] [source] [discussion] 2025-05-26 02:00:52
>>peterm+Ig
https://www.reddit.com/r/dadjokes/comments/flr7tc/which_weig...
175. godels+K71[view] [source] 2025-05-26 02:42:30
>>mef+(OP)
I think many people are missing the core of what Chomsky is saying. Miscommunication is easy, and I think that is primarily what is happening here. The analogy he gives really helps emphasize what he's trying to say.

If you're only going to read one part, I think it is this:

  | I mentioned insect navigation, which is an astonishing achievement. Insect scientists have made much progress in studying how it is achieved, though the neurophysiology, a very difficult matter, remains elusive, along with evolution of the systems. The same is true of the amazing feats of birds and sea turtles that travel thousands of miles and unerringly return to the place of origin.

  | Suppose Tom Jones, a proponent of engineering AI, comes along and says: “Your work has all been refuted. The problem is solved. Commercial airline pilots achieve the same or even better results all the time.”

  | If even bothering to respond, we’d laugh.

  | Take the case of the seafaring exploits of Polynesians, still alive among Indigenous tribes, using stars, wind, currents to land their canoes at a designated spot hundreds of miles away. This too has been the topic of much research to find out how they do it. Tom Jones has the answer: “Stop wasting your time; naval vessels do it all the time.”

  | Same response.
It is easy to look at performance metrics and call things solved. But there's much more depth to these problems than our ability to solve some task. It's not just about the ability to do something; the how matters. It isn't important that we can navigate better than birds or insects; our achievements say nothing about what they do.

This would be like saying we developed a good algorithm based only on looking at its ability to do some task. Certainly that is an important part, and even a core reason why we program in the first place! But its performance tells us little to nothing about its implementation. The implementation still matters! Are we making good use of our resources? Certainly we want to be efficient, in an effort to drive down costs. Are there flaws or errors that we didn't catch in our measurements? Those things come at huge costs and fundamentally limit our programs. Task performance tells us nothing about vulnerability to hackers, nor what their exploits will cost our business.

That's what he's talking about.
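
A concrete, if toy, version of the algorithm point above (my example, not anything from the article): two functions can pass every black-box test identically while their implementations differ in every way that actually matters for cost and robustness.

  # Identical task performance, very different implementations.
  def fib_naive(n: int) -> int:
      # exponential time: recomputes the same subproblems over and over
      return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

  def fib_iterative(n: int) -> int:
      # linear time, constant memory
      a, b = 0, 1
      for _ in range(n):
          a, b = b, a + b
      return a

  # Any test that only checks outputs sees no difference...
  assert all(fib_naive(n) == fib_iterative(n) for n in range(25))
  # ...but resource use and scalability are wildly different:
  # fib_naive(50) is hopeless, fib_iterative(50) is instant.

Measuring only the outputs tells you nothing about which of the two you actually built.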

Just because you can do something well doesn't mean you have a good understanding. It's natural to think the two are related, because understanding improves performance, and that's primarily how we drive our education. But it is not a necessary condition, and we have a long history demonstrating that. I'm quite surprised this concept is so contentious among programmers. We've seen the follies of relying on test-driven development; fundamentally, that is the same thing. There's more depth here than what we can measure, and we should not be quick to presume that good performance is the same as understanding[0,1]. We KNOW this isn't true[2].

I agree with Chomsky: it is laughable. It is laughable to think that the man in the Chinese Room[3] must understand Chinese. Forty years in, on a conversation hundreds of years old. Surely we know you can get a good grade on a test without actually knowing the material. Hell, there's the trivial case of just having the answer sheet.

[0] https://www.reddit.com/r/singularity/comments/1dhlvzh/geoffr...

[1] https://www.youtube.com/watch?v=Yf1o0TQzry8&t=449s

[2] https://www.youtube.com/watch?v=hV41QEKiMlM

[3] https://en.wikipedia.org/wiki/Chinese_room

◧◩
184. Calava+s91[view] [source] [discussion] 2025-05-26 03:06:40
>>irrati+y71
I don't have a degree in linguistics, but I took a few classes about 15 years ago, and Chomsky's works were basically treated as gospel. That said, my university's linguistics faculty included several of his former graduate students, so maybe there's a bias factor. In any case, it reminds me of an SMBC comic about how math and science advance over time [1].

[1] https://smbc-wiki.com/index.php/How-math-works

208. jdkee+Kh1[view] [source] 2025-05-26 04:59:43
>>mef+(OP)
Chomsky's own words.

https://www.nytimes.com/2023/03/08/opinion/noam-chomsky-chat...

◧◩◪◨⬒
240. globno+Nw1[view] [source] [discussion] 2025-05-26 07:51:45
>>Occams+Ep1
https://davidhume.org/texts/empl1/dm
246. Xmd5a+7D1[view] [source] 2025-05-26 08:58:41
>>mef+(OP)
https://magazine.caltech.edu/post/math-language-marcolli-noa...

These days, Chomsky is working with Hopf algebras (an algebraic structure also used in quantum physics) to explain language structure.

◧◩
251. Xmd5a+MF1[view] [source] [discussion] 2025-05-26 09:29:36
>>papave+591
>There was an interesting debate where Chomsky took a position on intelligence being rooted in symbolic reasoning and Asimov asserted a statistical foundation (ah, that was not intentional ;).

Chomsky vs Norvig

https://norvig.com/chomsky.html

◧◩◪◨
269. mopsi+CU1[view] [source] [discussion] 2025-05-26 11:52:25
>>to-too+Mj
He has also made significant contributions to the denial of the Khmer Rouge genocide and countless other atrocities committed by communist regimes across the world. Almost everything he's written on linguistics has been peer-reviewed, while almost none of his political work has undergone the same scrutiny before publication, and it shows.

  Noam Chomsky, the man who has spent years analyzing propaganda, is himself a propagandist. Whatever one thinks of Chomsky in general, whatever one thinks of his theories of media manipulation and the mechanisms of state power, Chomsky's work with regard to Cambodia has been marred by omissions, dubious statistics, and, in some cases, outright misrepresentations. On top of this, Chomsky continues to deny that he was wrong about Cambodia. He responds to criticisms by misrepresenting his own positions, misrepresenting his critics' positions, and describing his detractors as morally lower than "neo-Nazis and neo-Stalinists."(2) Consequently, his refusal to reconsider his words has led to continued misinterpretations of what really happened in Cambodia.

  /---/

  Chomsky often describes the Western media as propaganda. Yet Chomsky himself is no more objective than the media he criticizes; he merely gives us different propaganda. Chomsky's supporters frequently point out that he is trying to present the side of the story that is less often seen. But there is no guarantee that these "opposing" viewpoints have any factual merit; Porter and Hildebrand's book is a fine example. The value of a theory lies in how it relates to the truth, not in how it relates to other theories. By habitually parroting only the contrarian view, Chomsky creates a skewed, inaccurate version of events. This is a fundamentally flawed approach: It is an approach that is concerned with persuasiveness, and not with the truth. It's the tactic of a lawyer, not a scientist. Chomsky seems to be saying: if the media is wrong, I'll present a view which is diametrically opposed. Imagine a mathematician adopting Chomsky's method: Rather than insuring the accuracy of the calculations, problems would be "solved" by averaging different wrong answers.
https://www.mekong.net/cambodia/chomsky.htm
◧◩◪◨
274. belter+NW1[view] [source] [discussion] 2025-05-26 12:07:39
>>sireat+mo1
Phantoms in the Brain [1] has fantastic examples of the types of scenarios you described.

[1] - https://www.goodreads.com/book/show/31555.Phantoms_in_the_Br...

◧◩◪
297. Balgai+Fk2[view] [source] [discussion] 2025-05-26 15:02:29
>>lovepa+v61
Neuroscientist here:

> Perhaps there are no simple and beautiful natural laws, like those that exists in Physics, that can explain how humans think and make decisions...Perhaps it's all just emergent properties of some messy evolved substrate.

Yeah, it is very likely that there are no such laws; it's the substrate. The fruit fly brain (let alone the human one) has been mapped, and we've figured out that it's not just the synapse count but the 'weights' that matter too [0]. Mind you, those weights adjust in real time when a living animal is out there.

You'll see in the literature that there are people with some 'lucky' form of hydranencephaly [1] where their brain is as thin as paper. But they vote, get married, have kids, and for some strange reason seem to work in mailrooms (not a joke). So we know it's something about the connectome that's the 'magic' of a human.

My pet theory: we need memristors [2] to better represent these things. But that requires redesigning the computer from the metal on up, so it is unlikely to happen any time soon with the current AI craze.

> The big lesson from the AI development in the last 10 years from me has been "I guess humans really aren't so special after all" which is similar to what we've been through with Physics.

Yeah, biologists get there too, just the other way around, with animals and humans. For example, dogs make vitamin C internally, and humans have that gene too; it's just dormant, ready for evolution (or genetic engineering) to reactivate. That said, the neuroscience questions about us and the other great apes are somewhat large and strange. I'm not big into that literature, but from what little I know, the exact mechanisms and processes that get you from tool-using orangutans to tool-using humans seem to be a bit strange and hard for us to grasp. Again, not my field though.

In the end though, humans are special. We're the only ones on the planet that ever really asked a question. There's a lot to us, and we're actually pretty strange. There are many centuries of work left to do in biology; we're just wading at the edge of that ocean.

[0] https://en.wikipedia.org/wiki/Drosophila_connectome

[1] https://en.wikipedia.org/wiki/Hydranencephaly

[2] https://en.wikipedia.org/wiki/Memristor

◧◩◪
309. Compos+QH2[view] [source] [discussion] 2025-05-26 17:30:18
>>xron+IT1
I'm not that convinced by this paper. The "impossible languages" are all English with some sort of transformation applied, such as shuffling the word order. It seems like learning such languages would require first learning English and then learning the transformation. It's not surprising that systems would be worse at learning such languages than at learning English on its own. But I don't think these sorts of languages are what Chomsky is talking about.

When Chomsky says "impossible languages," he means languages that have a coherent and learnable structure but which aren't compatible with what he thinks are the innate grammatical faculties of the human mind. So for instance, x86 assembly language is reasonably structured and can express anything that C++ can, but unlike C++, it doesn't have a recursive tree-based syntax. Chomsky believes that any natural language you find will be structured more like C++ than like assembly language, because he thinks humans have an innate mental faculty for using tree-based languages. I actually think a better test of whether LLMs learn languages like humans would be to see whether they learn assembly as well as they learn C++. That would be incomplete, of course, but it would be getting at what Chomsky is talking about.

Also, GPT-2 actually seems to do quite well on some of the tested languages, including word-hop, partial-reverse, and local-shuffle. It doesn't do quite as well as on plain English, but GPT-2 was designed to learn English, so it's not surprising that English comes out a little ahead. For instance, the tokenization seems biased towards English: they show "bookshelf" becoming the tokens "book", "sh", and "lf", which in many of the tested languages get spread throughout a sentence. I don't think a system designed to learn shuffled English would tokenize this way!

https://aclanthology.org/2024.acl-long.787.pdf
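
To make the tokenization point concrete, here is a toy sketch (mine, using the comment's "book"/"sh"/"lf" pieces as a stand-in for GPT-2's real tokenizer, and a seeded shuffle as a stand-in for the paper's deterministic transformations): shuffling at the level of English-biased subword tokens scatters the pieces of a single word across the sentence.

  import random

  # English-biased subword tokens for "the bookshelf holds many books"
  tokens = ["the", "book", "sh", "lf", "holds", "many", "book", "s"]

  rng = random.Random(0)            # fixed seed: the "language" stays consistent
  order = list(range(len(tokens)))
  rng.shuffle(order)
  shuffled = [tokens[i] for i in order]
  print(shuffled)
  # The pieces "book", "sh", "lf" now sit in unrelated positions, so the model
  # must reassemble one English word across the whole sentence.  A tokenizer
  # designed for the shuffled language presumably wouldn't split words this way.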

◧◩
329. AfterH+5B3[view] [source] [discussion] 2025-05-27 00:52:26
>>atdt+SZ
What exactly do you mean by "analogous to our own" and "in a deep way" without making an appeal to magic or not-yet-discovered fields of science? I understand what you're saying, but when you scrutinize these things you end up in a place that's less scientific than one might think. That seems to be one of Chomsky's salient points: we really, really need to get a handle on when we're doing science in the contemporary Kuhnian sense and when we're doing philosophy.

The AI works on English, C++, Smalltalk, Klingon, nonsense, and gibberish. Like Turing's paper, this illustrates the difference between "machines being able to think" and "machines being able to demonstrate some well-understood mathematical process like pattern matching."

https://en.wikipedia.org/wiki/Computing_Machinery_and_Intell...
