zlacker

[parent] [thread] 3 comments
1. mike_h+(OP)[view] [source] 2026-02-03 09:10:48
The plan seems to be for lots and lots of smaller satellites.

For inferencing it can work well. One satellite could contain a handful of CPUs and do batch inferencing of even very large models, perhaps at low speeds in the beginning. Currently most AI workloads are interactive, but I can't see that staying true for long. As models improve and can be trusted to work independently for longer, it makes more sense to just queue work up and not worry about exactly how high your TTFT (time to first token) is.

For training I don't see it today; maybe in the future. But then, most AI workloads in the future should be inferencing, not training, anyway.

replies(1): >>KoolKa+Ej1
2. KoolKa+Ej1[view] [source] 2026-02-03 16:55:55
>>mike_h+(OP)
Latency means this still makes no sense to me. Perhaps some batch background-processing job, such as research or something, but that's a stretch.
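For scale, here's a back-of-envelope sketch of the physical propagation floor for a satellite hop. The altitudes (550 km for a LEO constellation, 35,786 km for geostationary) are my own illustrative assumptions, not figures from this thread, and the sketch ignores queuing, processing, and ground-network legs:

```python
C = 299_792.458  # speed of light in vacuum, km/s

def rtt_ms(altitude_km: float) -> float:
    """Round-trip radio time in ms for a straight up-and-down hop.

    Propagation only: no queuing, processing, or terrestrial backhaul.
    """
    return 2 * altitude_km / C * 1000

leo_rtt = rtt_ms(550)     # LEO, Starlink-like altitude: a few ms
geo_rtt = rtt_ms(35_786)  # geostationary, for comparison: a few hundred ms
```

The point of the sketch is that for LEO the speed-of-light penalty is small compared with a batch job's completion window; the dominant "latency" in this scenario would be scheduling and downlink capacity, not physics.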
replies(1): >>mike_h+OQ3
3. mike_h+OQ3[view] [source] [discussion] 2026-02-04 08:57:06
>>KoolKa+Ej1
I think most providers give their high-latency batch APIs significant discounts. A lot of AI workloads feel batch-oriented to me, or could be once they move beyond the prototype and testing phases. Chat will end up being a small fraction of load in the long term.
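To make the batch-oriented pattern concrete, here's a minimal sketch of queuing requests for a high-latency batch endpoint, modeled loosely on the JSONL request format of OpenAI-style batch APIs. The field names, model name, and prompts are illustrative assumptions, not details from this thread:

```python
import json

# Hypothetical work queue; in a real pipeline these would come from a job store.
prompts = ["Summarize report A", "Summarize report B"]

# Each line is one independent request; the provider processes the whole file
# within a completion window (e.g. 24 h), which is what earns the discount.
lines = []
for i, prompt in enumerate(prompts):
    lines.append(json.dumps({
        "custom_id": f"job-{i}",  # caller-chosen ID to match results back up
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",  # assumed model name, purely illustrative
            "messages": [{"role": "user", "content": prompt}],
        },
    }))

batch_jsonl = "\n".join(lines)
# Next steps (not shown): upload batch_jsonl, create the batch, poll for
# completion, then download results keyed by custom_id.
```

Nothing here needs a low TTFT; the caller only cares that results land within the window, which is exactly the traffic profile that could tolerate an orbital round trip.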
replies(1): >>KoolKa+Zs6
4. KoolKa+Zs6[view] [source] [discussion] 2026-02-04 23:22:32
>>mike_h+OQ3
That would imply there's still capacity here on Earth for this type of traffic.