Nvidia H100 GPUs: Supply and Demand

>>tin7in+(OP)
The real gut-punch here is the reminder of how far behind most engineers are in this race. With web 1.0 and web 2.0, at least you could rent a cheap VPS for $10/month and try out some stuff. There is almost no universe where a couple of guys in their garage get access to 1000+ H100s, with a capital cost in the multiple millions. Even renting at that scale is $4k/hour. That is going to add up quickly.
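To put a rough number on "add up quickly", here is a back-of-the-envelope sketch using only the two figures quoted above (1000 GPUs, $4k/hour for the whole cluster). The flat hourly rate, the 24/7 utilization, and the 30-day month are my assumptions for illustration, not real pricing, and `rental_cost` is just a hypothetical helper name.

```python
# Back-of-the-envelope rental cost, using the figures from the comment above.
# Assumptions (not from any price list): flat rate, 24/7 usage, 30-day month.
GPUS = 1000                        # "1000+ H100s"
CLUSTER_RATE_USD_PER_HOUR = 4000   # "$4k/hour" for the whole cluster


def rental_cost(hours, rate_per_hour=CLUSTER_RATE_USD_PER_HOUR):
    """Total rental cost in USD for the given number of cluster-hours."""
    return hours * rate_per_hour


# Implied per-GPU price at that cluster rate.
per_gpu_hour = CLUSTER_RATE_USD_PER_HOUR / GPUS

one_day = rental_cost(24)          # one full day of the cluster
one_month = rental_cost(24 * 30)   # one assumed 30-day month

print(f"per GPU-hour: ${per_gpu_hour:.2f}")
print(f"one day:      ${one_day:,}")
print(f"one month:    ${one_month:,}")
```

Under those assumptions that works out to about $4 per GPU-hour, roughly $96k per day, and close to $2.9M for a month of continuous use.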

I hope we find a path to at least fine-tuning medium-sized models for prices that aren't outrageous. Even the tiny corp's tinybox [1] is $15k and I don't know how much actual work one could get done on it.

If the majority of startups are just "wrappers around OpenAI (et al.)" the reason is pretty obvious.

1. https://tinygrad.org/

>>zoogen+Lo1
> I hope we find a path to at least fine-tuning medium sized models for prices that aren't outrageous

It's not that bad; there are lots of things you can do with a hobbyist budget. For example, a consumer GPU with 12 or 24 GB of VRAM costs $1000-2000 and lets you run many models and fine-tune them. The next step up, for fine-tuning larger models, is to rent a 4-8 GPU instance on vast.ai or something similar for a few hours, which will set you back maybe $200, still within the range of a hobbyist budget. Many academic fine-tuning efforts, like Stanford Alpaca, cost a few hundred dollars. It's only when you want to pretrain a large language model from scratch that you need thousands of GPUs and millions in funding.

>>luckyt+FU1
The question is what happens once you want to transition from your RTX 4090 to a business. It might be cute to generate 10 tokens per second, or whatever you can get with whatever model you have, to delight your family and friends. But once you want to scale that out into a genuine product, you're up against the ramp. Even a modest inference rig is going to cost a chunk of change, in the hundreds of thousands. You have no real way to validate your business model without making some big investment.

Of course, it is the businesses that find a way to make this work that will succeed. It isn't an impossible problem, it is just a seemingly difficult one for now. That is why I mentioned VC funding as appearing to have more leverage over this market than previous ones. If you can find someone to foot the 250k+ cost (e.g. AI Grant [1] where they offer 250k cash and 350k cloud compute) then you might have a chance.

1. https://aigrant.org/
