zlacker

1. IanCal+(OP) 2023-11-21 07:46:13
You're mixing up what we mean by what rules it's following or how it's working.

If I ask how it's able to write a poem on request and you tell me you know, because it multiplies and adds this set of 1.8 trillion numbers together X times with this set of accumulators, I would argue you don't understand how it works well enough to make any useful predictions.

Kind of like how you can understand what insane spaghetti code is doing (it's running this code) but still have absolutely no idea what business logic it encodes.
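
To make that concrete, here is a minimal sketch with a toy 2-2-1 ReLU network. The weights are hand-picked and hypothetical, not taken from any real model. Every multiply and add is fully visible, yet the weight lists alone don't announce what function they encode; you only find out by running inputs through it.

```python
# Toy illustration only: hand-picked, hypothetical weights for a 2-2-1 ReLU network.
W1 = [[1.0, 1.0], [1.0, 1.0]]  # hidden-layer weights, one row per hidden unit
B1 = [0.0, -1.0]               # hidden-layer biases
W2 = [1.0, -2.0]               # output weights
B2 = 0.0                       # output bias


def relu(v):
    return max(0.0, v)


def forward(x):
    # "It multiplies and adds these numbers": the complete mechanical description.
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, B1)]
    return sum(w * h for w, h in zip(W2, hidden)) + B2


def truth_table():
    # Probe the network experimentally on every binary input pair.
    return {(a, b): forward([float(a), float(b)])
            for a in (0, 1) for b in (0, 1)}


if __name__ == "__main__":
    for inputs, out in truth_table().items():
        print(inputs, out)
```

Only after probing it does it become clear that these eight numbers implement XOR. With billions or trillions of parameters, the same gap between "I can run the arithmetic" and "I know what it computes" is far wider.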

replies(1): >>galaxy+1s1
2. galaxy+1s1 2023-11-21 16:50:34
>>IanCal+(OP)
I believe it is not "spaghetti code" but well-engineered code. The output of an LLM is based on billions of fine-tuned parameters, but we know how those parameters came about, by executing the code of the AI-application in the training mode.

It doesn't really encode "business logic", it just matches your input with the best output it can come up with, based on how its parameters are fine-tuned. Saying that "We don't understand how it works" is just unnecessary AI-mysticism.

replies(1): >>IanCal+cB1
3. IanCal+cB1 2023-11-21 17:22:40
>>galaxy+1s1
The spaghetti code comparison is not to the code but the parameters.

> It doesn't really encode "business logic"

Doesn't it? GPT architectures can build world models internally while processing tokens (see Othello-GPT).

> we know how those parameters came about, by executing the code of the AI-application in the training mode.

Sure. But that's not actually a very useful description when trying to figure out how to use and apply these models to solve problems or understand what their limitations are.

> Saying that "We don't understand how it works" is just unnecessary AI-mysticism.

We don't understand it to the level we want to.

Tell you what, let's flip it around. If we know how they work just fine, why are smart researchers doing experiments with them? Why is looking at the code and billions or trillions of floats not enough?
