Stargate Project: SoftBank, OpenAI, Oracle, MGX to build data centers

>>tedsan+(OP)
Any clues as to how they plan to invest $500 billion? What infrastructure are they planning that will cost that much?

>>non-+Q1
Reasonably speaking, there is no way they can know how they plan to invest $500 billion. The current generation of large language models already uses basically all the human text that's ever been created as training data... not really sure where you go after that using the same tech.

>>jppope+C7
It seems to me you could generate a lot of fresh information by running every YouTube video, every hour of TV on archive.org, and every movie on The Pirate Bay through scene-by-scene image captioning plus high-quality Whisper transcription (not whatever junk auto-transcription YouTube has applied), and using that to produce screenplays of everything anyone has ever seen.

I'm not sure why I've never heard of this being done; it would be a good use of GPUs in between training runs.
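
To make the "screenplay" step concrete, here is a rough sketch of just the merge stage, assuming the captioning and transcription models have already produced timestamped output. The `Event` type, the `to_screenplay` function, and the `(seconds, text)` input shape are all made up for illustration; no real captioning or Whisper API is called here.

```python
from dataclasses import dataclass


@dataclass
class Event:
    start: float  # seconds from the start of the video
    kind: str     # "scene" for an image caption, "line" for transcribed speech
    text: str


def to_screenplay(captions, transcript):
    """Interleave scene captions and speech segments by timestamp.

    Both inputs are lists of (start_seconds, text) tuples. This is a
    hypothetical format, not the output format of any particular model.
    """
    events = [Event(t, "scene", c) for t, c in captions]
    events += [Event(t, "line", s) for t, s in transcript]
    # Sort by time; at equal timestamps, scene descriptions come before speech.
    events.sort(key=lambda e: (e.start, e.kind != "scene"))

    lines = []
    for e in events:
        seconds = int(e.start)
        stamp = f"[{seconds // 60:02d}:{seconds % 60:02d}]"
        label = "SCENE" if e.kind == "scene" else "VOICE"
        lines.append(f"{stamp} {label}: {e.text}")
    return "\n".join(lines)


if __name__ == "__main__":
    captions = [(0.0, "A kitchen, morning light."), (65.0, "A car pulls up outside.")]
    transcript = [(2.5, "Coffee's ready."), (65.0, "Who is that?")]
    print(to_screenplay(captions, transcript))
```

The expensive part is obviously the models upstream of this; the merge itself is trivial, which is part of why it seems like a cheap way to use idle GPUs.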

>>jazzyj+qa
> a lot of fresh information from running every youtube video

EVERY YouTube video?? Even the 9/11 truther videos? Sandy Hook conspiracy videos? Flat earth? Even the blatantly racist ones? Without some pruning, this would be pretty bad training data.

>>milton+jd
The best videos would be those where someone accidentally starts recording and you get 2 hours of natural conversation between real people. Not sure how often those get uploaded to YouTube.

Part of the reason that kids need less material is that they aren't just listening; they are also able to do experiments to see what works and what doesn't.
