> building new AI infrastructure for OpenAI in the United States
The carrot is probably something like: we will build enough compute to make a superintelligence that will solve all the problems, ???, profit.
I'm not sure why I've never heard of this being done; it would be a good use of GPUs in between training runs.
If it's possible to produce intelligence from just ingesting text, then current tech companies have all the data they need from their initial scrapes of the internet. They don't need more. That's a separate issue from keeping models up to date on current affairs.
EVERY YouTube video?? Even the 9/11 truther videos? The Sandy Hook conspiracy videos? Flat earth? Even the blatantly racist ones? This would be some bad training data without some pruning.
> Notably, it is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT.
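A minimal toy sketch of what that can mean in practice, assuming a GRPO-style setup where the only training signal is a rule-based reward over sampled completions (the reward function, completions, and answer check below are purely illustrative, not the paper's actual code):

```python
import numpy as np

def rule_based_reward(completion: str, reference_answer: str) -> float:
    """Reward 1.0 if the completion ends with the reference answer, else 0.0.
    No human-written SFT targets are involved -- only an automatic check."""
    return 1.0 if completion.strip().endswith(reference_answer) else 0.0

def group_relative_advantages(rewards: list[float]) -> np.ndarray:
    """GRPO-style advantage: normalize each sample's reward against the
    mean/std of the group of samples drawn for the same prompt."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# Toy example: 4 sampled completions for one math prompt, reference answer "42".
completions = [
    "Let's think step by step... 6 * 7 = 42, so the answer is 42",
    "The answer is 41",
    "The answer is 42",
    "I don't know",
]
rewards = [rule_based_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)
print(rewards)      # [1.0, 0.0, 1.0, 0.0]
print(advantages)   # correct samples get positive advantage, wrong ones negative
```

These advantages would then weight an ordinary policy-gradient update on the sampled tokens; the point is that the only supervision is the automatic reward check, not human-written chains of thought.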
Thermodynamic neural networks may also basically turn everything on its ear, especially if we figure out how to scale them like NAND flash.
If anything, I would estimate that this is a space-race type effort to “win” the AI “wars”. In the short term, it might work. In the long term, it’s probably going to result in a massive glut in accelerated data center capacity.
The trend of technology is towards doing better than natural processes, not doing it 100000x less efficiently. I don’t think AI will be an exception.
If we look at what is theoretically possible using thermodynamic wells with current model architectures, for instance, we could (in theory) build a network that applies 1T parameters in something like 1 cm². Back of the napkin, it would use about 20 watts and be able to generate a few thousand tokens/s (see the sketch below).
Operational thermodynamic wells have already been demonstrated in silicon. There are scaling challenges, cooling requirements, etc., but AFAIK no theoretical roadblocks to scaling.
Obviously, the theoretical doesn’t translate to results, but it does correlate strongly with the trend.
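Putting those napkin numbers in one place (1T parameters, ~20 W, a few thousand tokens/s are all rough estimates, not measurements), the implied energy budget works out roughly like this:

```python
# Back-of-the-napkin figures from the estimate above -- guesses, not measurements.
params = 1e12          # 1T parameters
power_w = 20.0         # ~20 W for the whole ~1 cm^2 device
tokens_per_s = 3000.0  # "a few thousand" tokens per second

# A dense forward pass is roughly 2 ops (or analog equivalents) per parameter per token.
ops_per_token = 2 * params

energy_per_token_j = power_w / tokens_per_s           # ~6.7e-3 J per token
energy_per_op_j = energy_per_token_j / ops_per_token  # ~3.3e-15 J per parameter-op

print(f"{energy_per_token_j:.1e} J/token, {energy_per_op_j:.1e} J/op")
# Ballpark, current digital accelerators sit somewhere around 1e-12 J per FLOP at
# half precision, so the claim amounts to a couple orders of magnitude of headroom.
```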
So the real question is, what can we build that can only be done if there are hundreds of millions of NVIDIA GPUs sitting around idle in ten years? Or alternatively, if those systems are depreciated and available on secondary markets?
What does that look like?
Part of the reason that kids need less material is that they aren't just listening; they are also able to do experiments to see what works and what doesn't.
but also the myriad hardcore private repositories of many high-tech US enterprises hacking on amazing shit (mine included) :)
Extropic update on building the ultimate substrate for generative AI https://twitter.com/Extropic_AI/status/1820577538529525977