zlacker

[parent] [thread] 6 comments
1. chaxor+(OP)[view] [source] 2023-05-16 14:58:48
Of course it predicts the next token. Every single person on earth knows that, so it's not worth repeating at all.

As for the fact that it gets things wrong sometimes - sure, that doesn't mean it actually learned every algorithm (in whichever model you may be thinking about). But the nice thing is that we now have this proof via category theory, which lets us frame and understand what has occurred, and consider how to align these systems to learn algorithms better.

replies(2): >>rdedev+82 >>glitch+l3
2. rdedev+82[view] [source] 2023-05-16 15:09:46
>>chaxor+(OP)
The fact that it sometimes fails at simple algorithms on large numbers, yet performs well on more complex algorithms with simple inputs, suggests to me that something is still insufficient at a fundamental level.
replies(2): >>zamnos+Y7 >>starlu+3l
3. glitch+l3[view] [source] 2023-05-16 15:14:51
>>chaxor+(OP)
> Of course it predicts the next token. Every single person on earth knows that, so it's not worth repeating at all

What's a token?

replies(1): >>visarg+hb
4. zamnos+Y7[view] [source] [discussion] 2023-05-16 15:34:32
>>rdedev+82
Insufficient for what? Humans regularly fail at simple algorithms for small numbers, never mind large numbers and complex algorithms.
5. visarg+hb[view] [source] [discussion] 2023-05-16 15:46:33
>>glitch+l3
A token is either a common word or a common enough word fragment. Rare words are expressed as multiple tokens, while frequent words get a single token. Together the tokens form a vocabulary of anywhere from 50k up to 250k entries, and any word or text can be written as some combination of them. In the worst case 1 token can be 1 character, say, when encoding a random sequence.

Tokens exist because transformers don't work directly on bytes or words: bytes would be too slow, and a word-level vocabulary would be too large, with some words appearing too rarely or never. The token system lets a small set of symbols encode any input. On average you can approximate 1 token = 1 word, or 1 token = 4 characters.

So tokens are the data type of the input and output, and the unit of measure for billing and context size in LLMs.
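
To make that concrete, here's a minimal sketch using OpenAI's tiktoken library (this assumes the cl100k_base encoding; other models ship different vocabularies):

    import tiktoken  # pip install tiktoken

    # GPT-4-style BPE encoding with a vocabulary of roughly 100k tokens
    enc = tiktoken.get_encoding("cl100k_base")

    print(enc.encode("hello"))   # common word -> a single token id
    print(enc.encode("antidisestablishmentarianism"))   # rare word -> several fragments
    print(enc.encode("zq$7x!"))  # random string -> close to one token per character
    print(enc.n_vocab)           # total vocabulary size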

6. starlu+3l[view] [source] [discussion] 2023-05-16 16:23:34
>>rdedev+82
You're focusing too much on what the LLM can handle internally. No, LLMs aren't good at math, but they understand mathematical concepts and can use a program or tool to perform calculations (see the sketch below).

Your argument is the equivalent of saying humans can't do math because they rely on calculators.

In the end what matters is whether the problem is solved, not how it is solved.

(assuming that the how has reasonable costs)
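
A toy sketch of that tool pattern (the CALC marker and run_tools harness below are invented for illustration, not any real API):

    import re

    # Hypothetical harness: the model writes CALC(...) markers instead of doing
    # the arithmetic itself; the harness evaluates each marker and splices the
    # result back into the text.
    def run_tools(model_output: str) -> str:
        def calc(match: re.Match) -> str:
            expr = match.group(1)
            # toy arithmetic evaluator; a real system would use a safe parser
            return str(eval(expr, {"__builtins__": {}}))
        return re.sub(r"CALC\(([^)]*)\)", calc, model_output)

    print(run_tools("12345 * 6789 = CALC(12345 * 6789)"))
    # -> 12345 * 6789 = 83810205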

replies(1): >>ipaddr+mW
7. ipaddr+mW[view] [source] [discussion] 2023-05-16 19:19:23
>>starlu+3l
Humans are calculators