zlacker

I kinda agree with both of you. It might be a required abstraction, but it's a leaky one.

Long before LLMs, I would talk about classes / functions / modules like "it then does this, decides the epsilon is too low, chops it up and adds it to the list".

The difference, I guess, is that it was only to a technical crowd, and nobody would mistake it for anything it wasn't. Everybody knew that "it" didn't "decide" anything.

With AI being so mainstream, and the math being much more elusive than a simple if..then, I guess it's just too easy to take this simple speaking convention at face value.

EDIT: some clarifications / wording

replies(4): >>flir+19 >>loxs+ri >>stoney+Yq >>HelloU+cw

>>cmenge+(OP)
Agreeing with you, this is a "can a submarine swim" problem IMO. We need a new word for what LLMs are doing. Calling it "thinking" is stretching the word to breaking point, but "selecting the next word based on a complex statistical model" doesn't begin to capture what they're capable of.

Maybe it's cog-nition (emphasis on the cog).

replies(9): >>whilen+s9 >>Leonar+Sf >>psycho+Wn >>JimDab+1r >>intend+AA >>Atlas6+bT >>delusi+V21 >>ryeats+cU1 >>seanhu+Ex2

>>flir+19
"predirence" -> prediction meets inference and it sounds a bit like preference

replies(1): >>psycho+xo

>>flir+19
What does a submarine do? Submarine? I suppose you "drive" a submarine, which gets at the idea: submarines don't swim because ultimately they are "driven"? I guess the point is that we don't make up a new word for what submarines do; we just don't use human words.

I think the above poster gets a little distracted by suggesting the models are creative which itself is disputed. Perhaps a better term, like above, would be to just use "model". They are models after all. We don't make up a new portmanteau for submarines. They float, or drive, or submarine around.

So maybe an LLM doesn't "write" a poem, but instead "models a poem", which may indeed take away a little of the sketchy magic and fake humanness they tend to be imbued with.

replies(6): >>Feepin+hj >>flir+Ir >>irthom+FM >>thinkm+hP >>jorvi+I01 >>j0057+Wj1

>>cmenge+(OP)
We can argue all day about what "think" means and whether an LLM thinks (probably not IMO), but at least in my head the threshold for "decide" is much lower, so I can perfectly accept that an LLM (or even a class) "decides". I don't have a conflict about that. Yeah, it might not be a decision in the human sense, but it's a decision in the mathematical sense, so I have always meant "decide" literally when I was talking about a piece of code.

It's much more interesting when we are talking about... say... an ant... Does it "decide"? That I have no idea as it's probably somewhere in between, neither a sentient decision, nor a mathematical one.

replies(1): >>0x457+3Q1

>>Leonar+Sf
Humans certainly model inputs. This is just using an awkward word and then making a point that it feels awkward.

>>flir+19
It does some kind of automatic inference (AI), and that's it.

>>whilen+s9
Except -ence is a regular morph, and you would rather suffix it to predict(at)-.

And prediction is already a hyponym of inference. Why not just use inference then?

replies(1): >>whilen+Aw

>>cmenge+(OP)
I mean you can boil anything down to its building blocks and make it seem like it didn't 'decide' anything. When you as a human decide something, your brain and its neurons just made some connections, with an output signal sent to other parts that resulted in your body 'doing' something.

I don't think LLMs are sentient or any bullshit like that, but I do think people are too quick to write them off before really thinking about how an NN 'knows' things similar to how a human 'knows' things: it is trained and reacts to inputs and outputs. The body is just far more complex.

replies(1): >>grey-a+tx

>>flir+19
> this is a "can a submarine swim" problem IMO. We need a new word for what LLMs are doing.

Why?

A plane is not a fly and does not stay aloft like a fly, yet we describe what it does as flying despite the fact that it does not flap its wings. What are the downsides we encounter that are caused by using the word “fly” to describe a plane travelling through the air?

replies(4): >>flir+ir >>dotanc+MQ >>Tijdre+nY >>lelant+6Y1

>>JimDab+1r
I was riffing on that famous Dijkstra quote.

>>Leonar+Sf
I really like that, I think it has the right amount of distance. They don't write, they model writing.

We're very used to "all models are wrong, some are useful", "the map is not the territory", etc.

replies(2): >>galang+Au >>seyebe+V83

>>flir+Ir
No one was as bothered when we anthropomorphized CRUD apps simply for the purpose of conversing about "them". "Ack! The thing is corrupting tables again because it thinks we are still using API v3! Who approved that last MR?!" The fact that people are bothered by the same language now is indicative in itself. If you want to maintain distance, pre-prompt models to structure all conversations to lack pronouns, as between a non-sentient language model and a non-sentient AGI. You can have the model call you out for referring to the model as existing. The language style that forces is interesting, and potentially more productive, except that there are fewer conversations formed like that in the training dataset. Translation being a core function of language models makes that less important, though. As for confusing the map for the territory, that is precisely what philosophers like Metzinger say humans are doing by considering "self" to be a real thing and that they are conscious, when they are just using the reasoning shortcut of narrating the meta model to be the model.

replies(1): >>flir+ZF

>>cmenge+(OP)
> EDIT: some clarifications / wording

This made me think, when will we see LLMs do the same; rereading what they just sent, and editing and correcting their output again :P

>>psycho+xo
I didn't think of prediction in the statistical sense here, but rather as a prophecy based on a vision, something that is inherently stored in a model without the knowledge of the modelers. I don't want to imply any magic or something supernatural here, it's just the juice that goes off the rails sometimes, and it gets overlooked due to the sheer quantity of the weights. Something like unknown bugs in production, but, because they still just represent a valid number in some computation that wouldn't cause any panic, these few bits can show a useful pattern under the right circumstances.

Inference would be the part that is deliberately learned and drawn from conclusions based on the training set, like in the "classic" sense of statistical learning.

>>stoney+Yq
I wasn't talking about knowing (they clearly encode knowledge), I was talking about thinking/reasoning, which is something LLMs do not in fact do IMO.

These are very different and knowledge is not intelligence.

replies(1): >>chpatr+W32

>>flir+19
It will help significantly to realize that the only thinking happening is when the human looks at the output and attempts to verify whether it is congruent with reality.

The rest of the time it’s generating content.

>>galang+Au
> You can have the model call you out for referring to the model as existing.

This tickled me. "There ain't nobody here but us chickens".

I have other thoughts which are not quite crystalized, but I think UX might be having an outsized effect here.

replies(1): >>galang+4T

>>Leonar+Sf
Depends on whether you are talking about an LLM or to the LLM. Talking to the LLM, it would not understand that "model a poem" means to write a poem. Well, it will probably guess right in this case, but if you go out of band too much it won't understand you. The hard problem today is rewriting out-of-band tasks to be in band, and that requires anthropomorphizing.

replies(1): >>dcooki+Hz1

>>Leonar+Sf
GenAI _generates_ output

>>JimDab+1r
For what it's worth, in my language the motion of birds and the motion of aircraft _are_ two different words.

>>flir+ZF
In addition to he/she etc. there is a need for a button for no pronouns. "Stop confusing metacognition for conscious experience or qualia!" doesn't fit well. The UX for these models is extremely malleable. The responses are misleading mostly to the extent the prompts were already misled. The sorts of responses that arise from ignorant prompts are those found within the training data in the context of ignorant questions. This tends to make them ignorant as well. There are absolutely stupid questions.

>>flir+19
A machine that can imitate the products of thought is not the same as thinking.

All imitations require analogous mechanisms, but that is the extent of their similarities, in syntax. Thinking requires networks of billions of neurons, and then, not only that, but words can never exist on a plane because they do not belong to a plane. Words can only be stored on a plane, they are not useful on a plane.

Because of this LLMs have the potential to discover new aspects and implications of language that will be rarely useful to us because language is not useful within a computer, it is useful in the world.

It's like seeing loosely related patterns in a picture and continuing to derive from those patterns, which are real but only loosely related.

LLMs are not intelligence, but it's fine that we use that word to describe them.

>>JimDab+1r
Flying isn’t named after flies; they both come from the same root.

https://www.etymonline.com/search?q=fly

>>Leonar+Sf
A submarine is propelled by a propeller and helmed by a controller (usually a human).

It would be swimming if it were propelled by drag (well, technically a propeller also uses drag via thrust, but you get the point). Imagine a submarine with a fish tail.

Likewise we can probably find an apt description in our current vocabulary to fittingly describe what LLMs do.

>>flir+19
> "selecting the next word based on a complex statistical model" doesn't begin to capture what they're capable of.

I personally find that description perfect. If you want it shorter you could say that an LLM generates.
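
For concreteness, here is a toy sketch of what "selecting the next word based on a statistical model" means mechanically. The bigram table and the names `BIGRAMS` and `generate` are made up for illustration; a real LLM replaces the lookup table with a neural network over tokens, so treat this as a cartoon of the loop, not of the model.

```python
import random

# Hypothetical toy "model": probability of the next word given the previous one.
# "<s>" marks the start of a sentence and "</s>" marks the end.
BIGRAMS = {
    "<s>": {"the": 1.0},
    "the": {"submarine": 0.6, "plane": 0.4},
    "submarine": {"dives": 0.7, "sails": 0.3},
    "plane": {"flies": 1.0},
    "dives": {"</s>": 1.0},
    "sails": {"</s>": 1.0},
    "flies": {"</s>": 1.0},
}


def generate(rng, max_words=10):
    """Sample one word at a time until the end marker or the length cap."""
    word = "<s>"
    out = []
    for _ in range(max_words):
        dist = BIGRAMS[word]
        # Draw the next word in proportion to its probability.
        word = rng.choices(list(dist), weights=list(dist.values()))[0]
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)


print(generate(random.Random()))
```

Nothing in the loop "decides" or "thinks"; it draws from a distribution and appends, which is why "generates" seems like a fair verb to me.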

>>Leonar+Sf
A submarine is a boat and boats sail.

replies(2): >>TimThe+Do1 >>floam+yh2

>>j0057+Wj1
An LLM is a stochastic generative model and stochastic generative models ... generate?

replies(1): >>Leonar+yx1

>>TimThe+Do1
And we are there. A boat sails, and a submarine sails. "A model generates" makes perfect sense to me. And saying ChatGPT generated a poem feels correct, personally. Indeed a model (e.g. a linear regression) generates predictions for the most part.

>>irthom+FM
> it won't understand you

Oops.

replies(1): >>irthom+rD1

>>dcooki+Hz1
That's consistent with my distinction between talking about them vs. to them.

>>loxs+ri
Well, it outputs a chain of thought that is later used to produce a better prediction. It produces a chain of thought similar to how one would think about a problem out loud. It's more verbose than what you would do, but you always have some ambient context that the LLM lacks.

>>flir+19
It's more like muscle memory than cognition. So maybe procedural memory but that isn't catchy.

replies(1): >>01HNNW+LZ1

>>JimDab+1r
> A plane is not a fly and does not stay aloft like a fly, yet we describe what it does as flying despite the fact that it does not flap its wings.

Flying doesn't mean flapping, and the word has a long history of being used to describe inanimate objects moving through the air.

"A rock flies through the window, shattering it and spilling shards everywhere" - see?

OTOH, we have never used the word "swim" in the same way - "The rock hit the surface and swam to the bottom" is wrong!

>>ryeats+cU1
They certainly do act like a thing which has a very strong "System 1" but no "System 2" (per Thinking, Fast And Slow)

>>grey-a+tx
To me all of those are so vaguely defined that arguing whether an LLM is "really really" doing something is kind of a waste of time.

It's like we're clinging to things that make us feel like human cognition is special, so we're saying LLMs aren't "really" doing it, then not defining what it actually is.

>>j0057+Wj1
Submarines dive.

>>flir+19
This is a total non-problem that has been invented by people so they have something new and exciting to be pedantic about.

When we need to speak precisely about a model and how it works, we have a formal language (mathematics) which allows us to be absolutely specific. When we need to empirically observe how the model behaves, we have a completely precise method of doing this (running an eval).

Any other time, we use language in a purposefully intuitive and imprecise way, and that is a deliberate tradeoff which sacrifices precision for expressiveness.

>>flir+Ir
What about they synthesize?

Ties in with creation from many and synthetic/artificial data. I usually prompt instruct my coding models more with “synthesize” than “generate”.