>>KoolKa+(OP)
I think most providers already give significant discounts on high-latency batch APIs. A lot of AI workloads feel batch-oriented to me, or could be once they move beyond the prototype and testing phases. Chat will end up being a small fraction of load in the long term.