zlacker

1. soulof+(OP)[view] [source] 2024-10-19 23:49:27
I think one big problem is that people understand LLMs as text-generation models, when really they're just sequence prediction models: a highly versatile, but data-hungry, way of encoding relationships and knowledge. LLMs are tuned for text input and output, but under the hood they operate purely on numbers (integer token IDs), and the underlying transformer architecture generalizes to any sequence domain.
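To make the "it's just numbers" point concrete, here's a toy sketch (all names hypothetical, with mean-pooling standing in for attention, so this is an illustration of the interface, not a real transformer): the model's contract is integer IDs in, a distribution over the vocabulary out, regardless of whether those IDs encode words, chess moves, or amino acids.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes; the model never sees "text",
# only integer IDs from 0..vocab_size-1.
vocab_size, d_model = 16, 8
embed = rng.normal(size=(vocab_size, d_model))    # token embedding table
unembed = rng.normal(size=(d_model, vocab_size))  # output projection

def next_token_logits(token_ids):
    """Minimal sequence model: embed, pool, project to vocab scores.
    A real transformer replaces the pooling with stacked attention
    layers, but the input/output interface is the same."""
    x = embed[np.array(token_ids)]  # (seq_len, d_model)
    pooled = x.mean(axis=0)         # stand-in for attention
    return pooled @ unembed         # (vocab_size,) unnormalized scores

logits = next_token_logits([3, 1, 4, 1, 5])
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs.argmax())  # ID of the predicted next token
```

Nothing in that loop cares what the IDs "mean" — the text-specific part is entirely in the tokenizer and the training data.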