zlacker

[return to "Twitter Is DDOSing Itself"]
1. brucet+bl 2023-07-01 20:08:08
>>ZacnyL+(OP)
Also, taking Elon's word at face value for a second... is Twitter really worth scraping for AI training or whatever?

It's a hive of misinformation, disinformation and toxicity. It's succinct I guess, but nothing is eloquent or descriptive because of the character limit. And it's full of repetitive "filler" information.

Who wants that in a foundational LLM dataset?

Maybe it's OK for finding labeled images... but that still seems kind of iffy.

2. TillE+Os 2023-07-01 20:48:20
>>brucet+bl
It's useful if you want your LLM to be able to generate tweet-like microblogging text. That does have some value.

Or maybe you want to get an aggregate idea of what people are currently talking about in the world, stuff that doesn't rise to the level of capital-N News. There aren't a lot of alternatives for that.

3. brucet+xx 2023-07-01 21:15:59
>>TillE+Os
Output formatting or a quick finetune/LoRA can do microblogging very easily.

Yeah, lots of general chat is unfortunately stuck on Twitter (or on difficult-to-scrape, siloed-off platforms).
