"Just" guessing the next token requires understanding. The fact that LLMs are able to respond so intelligently to such a wide range of novel prompts means that they have a very effective internal representation of the outside world. That's what we colloquially call "understanding."
This becomes pretty clear when you get to more complex algorithms or low-level details like drawing a stack frame. There is no logic there.
If it were real understanding, these LLMs wouldn't hallucinate so much.
Semantic understanding is still a ways off, and requires much more intelligence than we can give machines at this moment. Right now the machines are really good at frequency analysis, and in our fervor we mistake that for intelligence.
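To make "frequency analysis" concrete, here is a deliberately crude sketch: a bigram counter that predicts the next word purely from co-occurrence counts. The toy corpus and the `predict_next` helper are invented for illustration; a real LLM is vastly more sophisticated than this, but it conveys the flavor of prediction-by-statistics with no world model behind it.

```python
from collections import Counter, defaultdict

# Toy "frequency analysis": count which word follows which in a tiny corpus,
# then always predict the most frequent successor. There is no world model
# here, only co-occurrence statistics.
corpus = "the king divorced the queen and the king married again".split()

bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the word most frequently seen after `token` in the corpus."""
    followers = bigram_counts[token]
    return followers.most_common(1)[0][0] if followers else "<unk>"

print(predict_next("the"))   # 'king' (seen twice, vs. 'queen' once)
print(predict_next("king"))  # 'divorced' (ties broken by first occurrence)
```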
This results in the appearance of an arms race between world model refinement and user cleverness, but it's really a fundamental expressive limitation: the user can always recurse, but the model can only predict tokens.
(There are a lot of contexts in which this distinction doesn't matter, but I would argue that it does matter for a meaningful definition of human-like understanding.)
To do that effectively, you have to have a very significant understanding of the world. The texts that LLMs learn from describe a wide range of human knowledge, and if you want to accurately predict which words will appear where, you have to build an internal representation of that knowledge.
ChatGPT knows who Henry VIII was, who his wives were, the reasons he divorced/offed them, what a divorce is, what a king is, that England has kings, etc.
> If it were real understanding, these LLMs wouldn't hallucinate so much.
I don't see how this follows. First, humans hallucinate. Second, why does hallucination prove that LLMs don't understand anything? To me, it just means that they are trained to answer, and if they don't know the answer, they BS it.
I can ask ChatGPT questions that require logic to answer, and it will do just fine in most cases. It has certain limitations, but to say it isn't able to apply logic is just completely contrary to my experience with ChatGPT.
For example, I asked whether a python would have legs if all snakes have legs and a python is a snake. ChatGPT answers:
> Yes, if we assume the statement "all snakes have legs" to be true and accept that a python is a type of snake, then logically, a python would have legs. This conclusion follows from the structure of a logical syllogism:
> 1. All snakes have legs.
> 2. A python is a snake.
> 3. Therefore, a python has legs.
> However, it’s important to note that in reality, snakes, including pythons, do not have legs. This logical exercise is based on the hypothetical premise that all snakes have legs.
ChatGPT clearly understands the logic of the question, answers correctly, and then tells me that the premise of my question is incorrect.
You can say, "But it doesn't really understand logic. It's just predicting the most likely token." Well, it responds exactly how someone who understands logic would respond. If you assert that that's not the same as applying logic, then I think you're essentially making a religious statement.
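For what it's worth, the inference ChatGPT walks through is a textbook valid syllogism, and its validity doesn't depend on the premise being true. Here is a minimal sketch in Lean; the names `Animal`, `Snake`, `HasLegs`, and `python` are illustrative labels I chose, not anything the model produced.

```lean
-- A syllogism is valid independently of whether its premises are actually true.
-- Names here are illustrative only.
variable (Animal : Type) (Snake HasLegs : Animal → Prop) (python : Animal)

example
    (h1 : ∀ a, Snake a → HasLegs a)  -- premise 1: all snakes have legs
    (h2 : Snake python)              -- premise 2: a python is a snake
    : HasLegs python :=              -- conclusion: a python has legs
  h1 python h2
```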
An animation looks exactly like something in motion looks, but it isn't actually moving.
> All Xs have Ys.
> A Z is an X.
> Therefore a Z has Ys.
I am fairly certain variations of this pattern are in the training set. The tokens that follow, about Zs "in reality" not having Ys, come from X, Y, and Z being incongruous in the rest of the data.
It is not performing a logical calculation; it is predicting the next token.
Explanations of simple logical chains are also in the training data.
Think of it instead as a set of really good (and flexible) language templates; it can fill in a template with different things.
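As a toy illustration of that "template" intuition (not a claim about the model's internals; the `SYLLOGISM` string and `fill` helper are invented for the example):

```python
# Toy sketch of the "flexible language template" idea: a syllogism pattern with
# slots that can be filled with arbitrary (even false) content.
SYLLOGISM = (
    "1. All {xs} have {ys}.\n"
    "2. A {z} is a {x}.\n"
    "3. Therefore, a {z} has {ys}."
)

def fill(x: str, xs: str, ys: str, z: str) -> str:
    """Fill the syllogism template; the truth of the result is not checked."""
    return SYLLOGISM.format(x=x, xs=xs, ys=ys, z=z)

print(fill(x="snake", xs="snakes", ys="legs", z="python"))
print(fill(x="bird", xs="birds", ys="gills", z="penguin"))
```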
Those two things are not in any way mutually exclusive. Understanding the logic is an effective way to accurately predict the next token.
> I am fairly certain variations of this are in the training set.
Yes, which is probably how ChatGPT learned that logical principle. It has now learned to apply it correctly to novel situations. I suspect this is very similar to how human beings learn logic as well.