A non-anthropomorphized view of LLMs

>>zdw+(OP)
I have the technical knowledge to know how LLMs work, but I still find it pointless to not anthropomorphize, at least to an extent.

The language of "generator that stochastically produces the next word" is just not very useful when you're talking about, e.g., an LLM that is answering complex world modeling questions or generating a creative story. It's at the wrong level of abstraction, just as if you were discussing an UI events API and you were talking about zeros and ones, or voltages in transistors. Technically fine but totally useless to reach any conclusion about the high-level system.

We need a higher abstraction level to talk about higher level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels. So, considering that LLMs somehow imitate humans (at least in terms of output), anthropomorphization is the best abstraction we have, hence people naturally resort to it when discussing what LLMs can do.

>>Al-Khw+uK
On the contrary, anthropomorphism IMO is the main problem with narratives around LLMs - people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort (actively encouraged by the companies selling them) and it is completely distorting discussions on their use and perceptions of their utility.

>>grey-a+cL
I thought this too but then began to think about it from the perspective of the programmers trying to make it imitate human learning. That's what a nn is trying to do at the end of the day, and in the same way I train myself by reading problems and solutions, or learning vocab at a young age, it does so by tuning billions of parameters.

I think these models do learn similarly. What does it even mean to reason? Your brain knows certain things so it comes to certain conclusions, but it only knows those things because it was ''trained'' on those things.

I reason my car will crash if I go 120 mph on the other side of the road because previously I have 'seen' where the input is a car going 120mph has a high probability of producing a crash, and similarly have seen input where the car is going on the other side of the road, producing a crash. Combining the two would tell me it's a high probability.

zlacker