zlacker

[parent] [thread] 1 comments
1. glitch+(OP)[view] [source] 2023-05-16 15:14:51
> Of course it predicts the next token. Every single person on earth knows that, so it's not worth repeating at all

What's a token?

replies(1): >>visarg+W7
2. visarg+W7[view] [source] 2023-05-16 15:46:33
>>glitch+(OP)
A token is either a common word or a common-enough word fragment. Frequent words map to a single token, while rare words are split into multiple tokens. Tokenizer vocabularies typically range from 50k to 250k entries, and any word or text can be written as some combination of tokens. In the worst case, 1 token is 1 character, say, when encoding a random sequence.
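To make that concrete, here's a minimal sketch of greedy longest-match tokenization over a tiny hypothetical vocabulary (real tokenizers use BPE or similar algorithms and much larger vocabularies; `VOCAB` and `tokenize` here are illustrative names, not any real library's API):

```python
# Tiny hypothetical vocabulary: a few common fragments plus single
# characters as a fallback. Real vocabularies have 50k-250k entries.
VOCAB = {"token", "ization", "t", "o", "k", "e", "n",
         "i", "z", "a", "!", " "}

def tokenize(text):
    tokens = []
    i = 0
    while i < len(text):
        # Take the longest vocabulary entry matching at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Unknown character: emit it as its own token
            # (the worst case: 1 token per character).
            tokens.append(text[i])
            i += 1
    return tokens

print(tokenize("tokenization!"))  # ['token', 'ization', '!']
```

The common word "token" and the common suffix "ization" each cost one token, while anything outside the vocabulary falls back to per-character tokens.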

Tokens exist because transformers work on neither bytes nor words: byte sequences would be too long and slow, while a word-level vocabulary would be too large, and some words would appear too rarely or never. The token system lets a small set of symbols encode any input. On average you can approximate 1 token = 1 word, or 1 token = 4 chars.

So tokens are the data type of an LLM's input and output, and the unit of measure for its billing and context size.
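Since billing and context limits are counted in tokens, the ~4 chars/token average above gives a rough back-of-envelope estimator (a sketch only; actual counts depend on the specific model's tokenizer, and `estimate_tokens` is a hypothetical helper, not a real API):

```python
# Rough token-count estimate from character length, assuming the
# ~4 characters per token average (English-like text only).
def estimate_tokens(text):
    return max(1, round(len(text) / 4))

prompt = "Explain tokenization in one paragraph."
print(estimate_tokens(prompt))  # about 10 tokens for this 38-char prompt
```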

[go to top]