You can have a swarm of small, disposable satellites with laser links between them.
And for data centers, the satellites wouldn't be as far apart as Starlink satellites; they would be quite close together instead.
And a single cluster today would already require more solar & cooling capacity than all Starlink satellites combined.
I vaguely recall an article a while ago about the impact of GPU reliability: a big problem with training is that the entire cluster basically operates in lock-step, with each node needing the data its neighbors calculated during the previous step to proceed. The unfortunate side effect is that any single failure stops the entire hundred-thousand-node cluster from proceeding, so as the cluster grows even the tiniest per-node failure rate is going to absolutely ruin your uptime. I think they managed to solve this somehow, but I have absolutely no idea how.
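
To put rough numbers on the scaling problem, here is a back-of-the-envelope sketch. It assumes node failures are independent with a constant failure rate, and the 50,000-hour per-node MTBF is a made-up figure purely for illustration, not a measured one. The function names are my own.

```python
import math


def cluster_mtbf_hours(node_mtbf_hours: float, nodes: int) -> float:
    """Mean time between failures for a lock-step cluster.

    Assumes independent nodes with a constant failure rate, so the
    per-node failure rates simply add up across the cluster.
    """
    return node_mtbf_hours / nodes


def prob_no_failure(node_mtbf_hours: float, nodes: int, hours: float) -> float:
    """Probability that not a single node fails during `hours`."""
    return math.exp(-nodes * hours / node_mtbf_hours)


# Hypothetical numbers: each node fails once every 50,000 hours (~5.7 years).
NODE_MTBF_HOURS = 50_000.0
NODES = 100_000

mtbf = cluster_mtbf_hours(NODE_MTBF_HOURS, NODES)  # 0.5 hours
p_hour = prob_no_failure(NODE_MTBF_HOURS, NODES, hours=1.0)

print(f"cluster MTBF: {mtbf} hours")
print(f"chance of one failure-free hour: {p_hour:.1%}")
```

Under those assumptions, a node that on its own looks very reliable still gives you a cluster that gets interrupted about every half hour, which is why any failure stalling the whole lock-step run is so painful at this scale.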