zlacker

[return to "A non-anthropomorphized view of LLMs"]
1. Al-Khw+uK 2025-07-07 07:19:37
>>zdw+(OP)
I have enough technical knowledge to know how LLMs work, but I still find it pointless not to anthropomorphize them, at least to an extent.

The language of "generator that stochastically produces the next word" is just not very useful when you're talking about, e.g., an LLM that is answering complex world-modeling questions or generating a creative story. It's at the wrong level of abstraction, just as if you were discussing a UI events API and talking about zeros and ones, or voltages in transistors. Technically accurate, but useless for reaching any conclusion about the high-level system.
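
For concreteness, that low-level description amounts to roughly this, repeated token after token (a toy sketch: the vocabulary, logits, and temperature below are invented for illustration; a real model produces logits over a vocabulary of ~100k tokens from billions of parameters):

    import math
    import random

    # One step of the "stochastic next-word generator" (all values made up).
    vocab = ["the", "cat", "sat", "on", "mat", "."]
    logits = [2.0, 1.5, 0.3, 0.1, 1.2, 0.4]  # hypothetical model output
    temperature = 0.8

    # Softmax over temperature-scaled logits -> a probability distribution
    # over the vocabulary.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # "Stochastically produce the next word": sample one token.
    print(random.choices(vocab, weights=probs, k=1)[0])

Every token the model emits comes from repeating that one step, which is exactly why this level of description tells you nothing about whether the output is a world-modeling answer or a creative story.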

We need a higher abstraction level to talk about higher-level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels. So, given that LLMs somehow imitate humans (at least in their output), anthropomorphization is the best abstraction we have, hence people naturally resort to it when discussing what LLMs can do.

2. grey-a+cL 2025-07-07 07:28:19
>>Al-Khw+uK
On the contrary: anthropomorphism is IMO the main problem with narratives around LLMs. People are genuinely talking about them thinking and reasoning when they are doing nothing of the sort (actively encouraged by the companies selling them), and it is completely distorting discussions of their use and perceptions of their utility.
3. fenoma+hT 2025-07-07 08:44:49
>>grey-a+cL
When I see these debates it's always the other way around: one person speaks colloquially about an LLM's behavior, and then somebody else jumps on them for supposedly believing the model is conscious, just because the speaker said "the model thinks..." or "the model knows..." or whatever.

To be honest, the impression I've gotten is that some people are just very interested in talking about not anthropomorphizing AI and less interested in talking about AI behaviors, so they see conversations about the latter as a chance to talk about the former.

4. scarfa+vi1 2025-07-07 12:29:12
>>fenoma+hT
Wait until a conversation about “serverless” comes up and someone says there is no such thing because there are servers somewhere, as if everyone (especially on HN) doesn’t already know that.
5. Tijdre+ON1 2025-07-07 15:50:40
>>scarfa+vi1
Why would everyone know that? Not everyone has experience in sysops, especially not beginners.

E.g. when I first started learning webdev, I didn’t think about ‘servers’. I just knew that if I uploaded my HTML/PHP files to my shared web host, then they appeared online.

It was only much later that I realized that shared webhosting is ‘just’ an abstraction over Linux/Apache; after all, I first had to learn about those topics.
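
In hindsight, what the shared host was doing with my uploaded files is a surprisingly small amount of machinery: a server process mapping URLs to files in a directory. A minimal sketch (using Python's stdlib in place of Linux/Apache; the directory name here is made up):

    import functools
    from http.server import HTTPServer, SimpleHTTPRequestHandler

    # Serve files out of a directory, the way a shared host serves
    # whatever lands in public_html.
    handler = functools.partial(SimpleHTTPRequestHandler, directory="public_html")
    HTTPServer(("", 8000), handler).serve_forever()

Of course this toy version skips everything that made shared hosting valuable (PHP execution, TLS, logging, multi-tenancy), but that gap is precisely what the abstraction was hiding from me.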

6. godels+YM2 2025-07-07 22:54:16
>>Tijdre+ON1
I think they fumbled the wording, but I interpreted them as meaning the audience of HN, and it seems they confirmed that.

We are always speaking to our audience, right? This is also what makes more general/open discussions difficult (e.g. talking on Twitter/Facebook/etc.): there are many ways to interpret anything, depending on prior knowledge, cultural biases, and so on. But I think it is fair that on HN we can assume that people here are tech-savvy and knowledgeable. We'll definitely overstep and understep at times, but shouldn't we also cultivate a culture where it is okay to ask, and okay to apologize for assuming too much?

I mean, at the end of the day we've got to make some assumptions, right? If we assume zero working knowledge, then comments are going to get pretty massive and, frankly, bad at communicating with a niche audience even if better at communicating with a general one. But should HN be a place for the general public? I think not. I think it should be a place for people interested in computers and programming.
