zlacker

[return to "Twitter Is DDOSing Itself"]
1. brucet+bl[view] [source] 2023-07-01 20:08:08
>>ZacnyL+(OP)
Also, taking Elon's word at face value for a second... is Twitter really worth scraping for AI training or whatever?

It's a hive of misinformation, disinformation and toxicity. It's succinct I guess, but nothing is eloquent or descriptive because of the character limit. And it's full of repetitive "filler" information.

Who wants that in a foundational LLM dataset?

Maybe it's OK for finding labeled images... But that still seems kinda iffy.

2. muixoo+gG[view] [source] 2023-07-01 22:12:45
>>brucet+bl
I once got paid $20 as an undergrad to go through hundreds of thousands of tweets and convert slang into plain English for training data. The only thing I took away from the experience, aside from finally getting good with vim macros, is that the average tweet is really low-effort and uninteresting. I don't recall reading a single thing that I would imagine someone retweeting (think that's what it's called). Maybe I was given only replies. Anyway, not sure if there's value there for LLMs, but I'd be skeptical.