https://docs.confluent.io/platform/current/kafka/deployment....
There is a point where the exponential pricing starts, but that point is much further out than most people expect. Probably ~100 CPUs, ~1TB RAM, >50Gbps network, etc.
CPU isn't usually a problem until you start using very large compactions, and then suddenly it can be a massive bottleneck. (Actually I would love to abuse more RAM here but log.cleaner.dedupe.buffer.size has a tiny maximum value!)
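For reference, a minimal sketch of how you might nudge those cleaner settings, assuming your broker version exposes them as cluster-wide dynamic configs (the bootstrap address and the values here are placeholders; the same keys can also just go in server.properties):

    import java.util.Collection;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class CleanerTuning {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Placeholder bootstrap address.
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                // Empty resource name targets the cluster-wide default broker config.
                ConfigResource cluster = new ConfigResource(ConfigResource.Type.BROKER, "");

                // More cleaner threads spread compaction CPU across cores; the dedupe
                // buffer is split across those threads and each thread's share is
                // capped, so raising the buffer alone only goes so far.
                Collection<AlterConfigOp> ops = List.of(
                    new AlterConfigOp(new ConfigEntry("log.cleaner.threads", "4"),
                                      AlterConfigOp.OpType.SET),
                    new AlterConfigOp(new ConfigEntry("log.cleaner.dedupe.buffer.size",
                                      String.valueOf(512L * 1024 * 1024)),
                                      AlterConfigOp.OpType.SET)
                );

                admin.incrementalAlterConfigs(Map.of(cluster, ops)).all().get();
            }
        }
    }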
Kafka Streams (specifically) is also configured to transact by default, even though most applications aren't written to actually benefit from that. If you run lots of different consumer services, this results in burning a lot of CPU on transactions in a "flat profile"-y way that's hard to illustrate to application developers, since each consumer, individually, is relatively small - there are just thousands of them.
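If an app genuinely doesn't need the transactional guarantee, pinning it explicitly is a one-line config change. A sketch (the application id and broker address are placeholders, and which guarantee you actually get by default depends on your version and config):

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    public class StreamsGuarantee {
        static Properties streamsProps() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "some-consumer-service"); // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");        // placeholder
            // Explicitly run at-least-once instead of exactly-once so the app
            // skips the per-commit transaction overhead it wasn't using anyway.
            props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.AT_LEAST_ONCE);
            return props;
        }
    }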
Amusingly, for $94K (probably more like $85K after negotiation) you can buy a white box server: dual Epyc 9000-series, 96 cores/192 threads, 3.5GHz, with 3TB RAM, 240TB of very fast SSD, and a 10G NIC. The minimum config (dual Epyc 9124, 16 cores/32 threads, 64GB RAM, and only 4TB of storage) is $9K (more like $8K after negotiation). That's "only" a factor of 10 in price for 6X the cores, 48X the RAM, and 60X the storage.
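Back-of-the-envelope with the figures as quoted, if you want the per-unit comparison:

    public class PricePerUnit {
        public static void main(String[] args) {
            // List prices and specs as quoted above (before negotiation).
            double bigPrice = 94_000, smallPrice = 9_000;
            double bigCores = 96,     smallCores = 16;
            double bigRamGb = 3_072,  smallRamGb = 64;
            double bigSsdTb = 240,    smallSsdTb = 4;

            System.out.printf("price ratio: %.1fx%n", bigPrice / smallPrice);    // ~10.4x
            System.out.printf("$/core: %.0f vs %.0f%n", bigPrice / bigCores, smallPrice / smallCores);
            System.out.printf("$/GB RAM: %.1f vs %.1f%n", bigPrice / bigRamGb, smallPrice / smallRamGb);
            System.out.printf("$/TB SSD: %.0f vs %.0f%n", bigPrice / bigSsdTb, smallPrice / smallSsdTb);
        }
    }

Per core the big box actually costs a bit more (~$980 vs ~$560), but per GB of RAM and per TB of SSD it's roughly 5X cheaper.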
And the reason they do it that way is that it's cheaper: the scaling is sublinear up to a good size.