zlacker

[parent] [thread] 14 comments
1. tempes+(OP)[view] [source] 2026-02-02 22:45:48
That helps with the heat from the sun problem, but not the radiation of heat from the GPUs. Those radiators would need to be unshaded by the solar panels, and would need to be enormous. Cooling stuff in atmosphere is far easier than in vacuum.
replies(2): >>bdamm+A4 >>Doctor+pn
2. bdamm+A4[view] [source] 2026-02-02 23:01:38
>>tempes+(OP)
Not so. Look at the construction of JWST. One side is "hot", the other side is very, very cold.

I am highly skeptical about data centers in space, but radiators don't need to be unshaded. In fact, they benefit from the shade. This is also being done on the ISS.

replies(3): >>RIMR+r7 >>tempes+ah >>lm2846+wn
◧◩
3. RIMR+r7[view] [source] [discussion] 2026-02-02 23:14:10
>>bdamm+A4
Look at how many layers of insulation are needed for the JWST to have a hot and cold side! Again, this is not particularly simple stuff.

The JWST operates at about 2 kW max. That's barely enough for a couple of H200s, let alone a rack of them.

AI datacenters in space are a non-starter. Anyone arguing otherwise doesn't understand basic thermodynamics.

replies(1): >>Doctor+Mx
◧◩
4. tempes+ah[view] [source] [discussion] 2026-02-02 23:59:07
>>bdamm+A4
That's fair. I meant they would need a clear path to open space not blocked by solar panels, but yes, a hot and cold side makes sense.

The whole concept is still insane though, fwiw.

replies(1): >>Doctor+Fm
◧◩◪
5. Doctor+Fm[view] [source] [discussion] 2026-02-03 00:31:29
>>tempes+ah
"I meant they would need a clear path to open space not blocked by solar panels, but yes, a hot and cold side makes sense."

This is precisely why my didactic example above uses a convex shape, a pyramid: each surface is guaranteed to absorb or radiate to open space without having to account for self-shadowing by the satellite's own shape.

6. Doctor+pn[view] [source] 2026-02-03 00:36:00
>>tempes+(OP)
This makes no sense. The heat radiated from the GPUs came from electrical energy, and the electrical energy came from the efficient fraction of the solar panels' output, the inefficient fraction being direct heating of the panels. The total amount of heat that needs to be dissipated is simply the total amount of energy incident on the solar panels.
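The balance being described, as a two-line sketch (the 1 MW and 20% figures are placeholders, not numbers from this thread):

```python
# Steady-state energy balance for a solar-powered satellite.
incident_w = 1_000_000.0   # sunlight hitting the panels (placeholder)
panel_eff = 0.20           # assumed panel efficiency

electrical_w = panel_eff * incident_w         # powers the GPUs
panel_waste_w = (1 - panel_eff) * incident_w  # heats the panels directly

# The GPUs turn essentially all their electrical input into heat,
# so the total heat the satellite must radiate equals the total
# incident power, regardless of the efficiency split:
total_heat_w = electrical_w + panel_waste_w
assert total_heat_w == incident_w
```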
replies(1): >>tempes+UB
◧◩
7. lm2846+wn[view] [source] [discussion] 2026-02-03 00:37:06
>>bdamm+A4
> Look at the construction of JWST.

A very high-end desktop pulls more electricity than the whole JWST... which is about the same as a hair dryer.

Now you need about 50x more for a rack, and hundreds or thousands of racks for a meaningful cluster. Shaded or not, it's a shit load of radiators.

https://azure.microsoft.com/en-us/blog/microsoft-azure-deliv...
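For a rough sense of the radiator scale, a Stefan-Boltzmann sketch (my assumptions, not numbers from the thread: radiators at 300 K, emissivity 0.9, radiating one-sided to cold space, ignoring solar and Earth heat loads):

```python
SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)

def radiator_area(power_w, temp_k=300.0, emissivity=0.9):
    """One-sided radiator area (m^2) needed to reject power_w to ~0 K space."""
    return power_w / (emissivity * SIGMA * temp_k**4)

rack_area = radiator_area(50_000)             # one assumed ~50 kW rack
cluster_area = radiator_area(50_000 * 1_000)  # a hypothetical 1000-rack cluster
```

Under those assumptions a single ~50 kW rack already needs on the order of 120 m² of radiator, and a 1000-rack cluster roughly 12 hectares; running the radiators hotter shrinks this (area falls as T⁴), at the cost of a hotter cold plate.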

replies(1): >>Doctor+6e1
◧◩◪
8. Doctor+Mx[view] [source] [discussion] 2026-02-03 01:45:43
>>RIMR+r7
The goal of JWST is not to consume as much power as possible and perform useful computations with it. A system optimized for metric A rather than metric B scores badly on metric B... great observation.
◧◩
9. tempes+UB[view] [source] [discussion] 2026-02-03 02:11:25
>>Doctor+pn
True, the solar panels would need to be enormous too.
replies(1): >>Doctor+071
◧◩◪
10. Doctor+071[view] [source] [discussion] 2026-02-03 06:47:02
>>tempes+UB
Let's say we wanted to train LLaMa 3.1 405B:

[0] https://developer.nvidia.com/deep-learning-performance-train...

Click the "Large Language Model" tab next to the default "MLPerf Training" tab.

That takes 16.8 days on 128 B200 GPUs:

> Llama3 405B 16.8 days on 128x B200

A DGX B200 contains 8 B200 GPUs. So it takes 16.8 days on 16 DGX B200 systems.

A single DGX (8x)B200 node draws about 14.3 kW under full load.

> System Power Usage ~14.3 kW max

source [1] https://www.nvidia.com/en-gb/data-center/dgx-b200

16 x 14.3 kW = ~230 kW

At ~20% solar panel efficiency, we need about 1.15 MW of optical power incident on the solar panels.

The required solar panel area becomes 1.15 × 10^6 W / (1.36 × 10^3 W/m^2) ≈ 846 m^2.

That's about 30 m × 30 m.

From the center of the square solar panel array to the tip of the pyramid would be 3 × 30 m = 90 m.

An unprecedented feat? Yes. But no physics is being violated here. The parts could be launched serially and then assembled in space. That's a device that can pretrain LLaMa 3.1 from scratch in 16.8 days. It would have way too much memory for LLaMa 3.1: 16 × 8 × 192 GB ≈ 25 TB of GPU RAM. So this thing could pretrain much larger models, though it would train them more slowly than a LLaMa 3.1.

Once up there it enjoys free energy for as long as it survives, no competing on the electrical grid with normal industry, or domestic energy users, no slow cooking of the rivers and air around you, ...
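The arithmetic above as a quick sanity-check script (all constants are the comment's own figures; the small gap between ~841 m² here and ~846 m² above is just the rounding of 228.8 kW to 230 kW):

```python
import math

nodes = 16               # DGX B200 systems (128 GPUs / 8 per node)
node_power_w = 14_300.0  # max draw per DGX B200, W
solar_constant = 1360.0  # W/m^2 of sunlight at ~1 AU
panel_eff = 0.20         # assumed solar panel efficiency

electrical_w = nodes * node_power_w   # ~229 kW of electrical load
optical_w = electrical_w / panel_eff  # ~1.14 MW incident sunlight needed
area_m2 = optical_w / solar_constant  # ~841 m^2 of panel
side_m = math.sqrt(area_m2)           # ~29 m, i.e. roughly 30 m x 30 m

gpu_ram_tb = nodes * 8 * 192 / 1000   # ~24.6 TB of GPU RAM (the ~25 TB figure)
```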

replies(1): >>lm2846+Az1
◧◩◪
11. Doctor+6e1[view] [source] [discussion] 2026-02-03 07:47:23
>>lm2846+wn
addressed at >>46867402
◧◩◪◨
12. lm2846+Az1[view] [source] [discussion] 2026-02-03 10:31:08
>>Doctor+071
We're talking past each other, I think. In theory we can cool down anything we want; that's not the problem. 16 DGX B200 nodes isn't a datacenter, and it's certainly nowhere close to the figures being discussed (500-1000 TW of AI satellites per year).

Nobody said sending up a single rack and cooling it is technically impossible. We're saying sending up datacenters' worth of racks is insanely complex and most likely not financially viable nor currently possible.

Microsoft just built a datacenter with 4600 racks of GB300. That's 4600 × ~1.5 t, which alone weighs more than everything we sent into orbit in 2025, and that's without power or cooling. And we're still far from a single terawatt.

replies(1): >>Doctor+bB1
◧◩◪◨⬒
13. Doctor+bB1[view] [source] [discussion] 2026-02-03 10:46:00
>>lm2846+Az1
It is instructive to calculate the size and requirements of a system that can pretrain a 405B-parameter transformer in ~17 days.

A different question is the expected payback time. Unless someone can demonstrate a reasonable calculation showing a sufficiently short payback period, we can't rule out big tech seeing something we don't have access to (the launch costs charged to third parties may differ from their own internal launch costs, for example).

Suppose the payback time really is short enough, or the commercial life long enough, to make sense; then the scale doesn't really matter, it just means sending up the system described above repeatedly.

replies(1): >>lm2846+GO1
◧◩◪◨⬒⬓
14. lm2846+GO1[view] [source] [discussion] 2026-02-03 12:22:58
>>Doctor+bB1
I mean, yeah, if you consider the scale not to be a problem, then there are no problems indeed. I argue that scale actually is the biggest problem here... which is the case with most of our issues (energy, pollution, cooling, heating, &c.)
replies(1): >>Doctor+FZ1
◧◩◪◨⬒⬓⬔
15. Doctor+FZ1[view] [source] [discussion] 2026-02-03 13:35:23
>>lm2846+GO1
The real question is not scale but whether it makes financial sense, and I don't have sufficient insight to answer that.

Either it makes financial sense or it doesn't, and if it does, scale isn't the issue (well, until we run into material shortages building Elon's Dyson sphere, hah).
