I don't know how to make sense of this level of investment; I lack the conceptual framework for what half a trillion USD of purchasing power actually buys in this context.
This sort of $100-500B budget doesn't sound like training-cluster money; it sounds more like anticipating massive industry uptake, with multiple datacenters running inference (and all of corporate America's data sitting in the cloud).
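To get some feel for the scale, here's a back-of-envelope sketch. All the unit costs (price per accelerator, buildout cost per MW, power per GPU, budget split) are my own rough assumptions for illustration, not figures from any announcement:

```python
# Back-of-envelope: what might ~$500B plausibly buy?
# Every unit cost below is an assumption, purely for illustration.

BUDGET_USD = 500e9

GPU_UNIT_COST = 30_000          # assumed all-in price per accelerator
GPU_SHARE = 0.6                 # assumed fraction of budget spent on accelerators
DATACENTER_COST_PER_MW = 10e6   # assumed buildout cost per MW of capacity
POWER_PER_GPU_KW = 1.0          # assumed draw per accelerator incl. overhead

gpu_budget = BUDGET_USD * GPU_SHARE
num_gpus = gpu_budget / GPU_UNIT_COST

total_power_mw = num_gpus * POWER_PER_GPU_KW / 1000
facility_cost = total_power_mw * DATACENTER_COST_PER_MW

print(f"Accelerators: ~{num_gpus / 1e6:.1f} million")
print(f"Power needed: ~{total_power_mw / 1000:.1f} GW")
print(f"Facility buildout at assumed $/MW: ~${facility_cost / 1e9:.0f}B")
```

Under those made-up numbers you end up with something like ten million accelerators drawing on the order of ten gigawatts, with roughly $100B of the remaining budget going to buildout, which at least makes the "many inference datacenters" reading feel more plausible than a single training cluster.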
I've read that some datacenters run mixed-generation GPUs, updating only part of the fleet at a time, but I'm not sure if they all do that.
It'd be interesting to read something about how updates are typically managed/scheduled.
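I don't actually know how operators handle this, but the picture I have in mind is a rolling refresh: drain and upgrade only a bounded slice of capacity per maintenance window so the rest keeps serving. A toy sketch of that idea, with all names and numbers made up:

```python
# Toy sketch of a rolling GPU-fleet refresh: upgrade at most a fixed
# fraction of total capacity per maintenance window. Purely illustrative,
# not how any real operator does it.

from dataclasses import dataclass

@dataclass
class Rack:
    rack_id: str
    gpu_generation: str   # hypothetical labels, e.g. "A100", "H100"
    capacity: float       # relative serving capacity

def plan_refresh(fleet, old_gen, max_offline_fraction=0.1):
    """Yield per-window batches of racks to drain and upgrade."""
    total_capacity = sum(r.capacity for r in fleet)
    to_upgrade = [r for r in fleet if r.gpu_generation == old_gen]

    batch, batch_capacity = [], 0.0
    for rack in to_upgrade:
        # Start a new window if adding this rack would exceed the offline cap.
        if batch_capacity + rack.capacity > max_offline_fraction * total_capacity:
            yield batch
            batch, batch_capacity = [], 0.0
        batch.append(rack)
        batch_capacity += rack.capacity
    if batch:
        yield batch

fleet = [Rack(f"rack-{i}", "A100" if i % 3 else "H100", 1.0) for i in range(30)]
for window, racks in enumerate(plan_refresh(fleet, "A100"), start=1):
    print(f"window {window}: upgrade {[r.rack_id for r in racks]}")
```

Presumably the real constraints (supply, power, interconnect topology, which workloads tolerate older generations) make the actual scheduling far messier, which is exactly why I'd like to read a proper write-up on it.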