I was wondering - I've been thinking about switching to AI systems programming (I know, easy task), but from what I understand, industry cloud GPUs are the main winners, right? Nobody's going to pay me (assuming I even had the skills) to optimize for consumer GPUs?
From what I understand, it's not just a matter of count + capacity + performance — the core primitives themselves differ. I don't think any of the consumer "Blackwell" chips like the Grace one or the RTX 5090 have, for example, SM pairs in their ISA? And likewise there are similar fundamental differences between consumer and cloud Hopper (where the majority of the perf comes from the cloud one's ISA?)
So I guess I'm wondering if I should buy a GPU myself or should I just rent on the cloud if I wanted to start getting some experience in this field. How do you even get experience in this normally anyways, do you get into really good schools and into their AI labs which have a lot of funding?
Do you mean the coupled dies on stuff like the B200? If so, each NVidia die already has many SMs.
Do you mean TMEM MMA cooperative execution? I'm guessing that must be it given what the paper is about.
cooperative execution yeah
as you can tell I do not do CUDA for a living :D
> So I guess I'm wondering if I should buy a GPU myself or should I just rent on the cloud if I wanted to start getting some experience in this field. How do you even get experience in this normally anyways, do you get into really good schools and into their AI labs which have a lot of funding?
Unless you have money to throw around, you'd better start working on something: write some code and get it running on a leased GPU before deciding on a long-term plan.
People will pay, but probably less — not many companies doing AI at the edge can pay the mega millions.
> And likewise similar fundamental differences between consumer and cloud hopper (where the majority of the perf is the cloud one's ISA?)
I think Hopper was the generation where they did a clean split, and it's datacenter-only.
> So I guess I'm wondering if I should buy a GPU myself or should I just rent on the cloud if I wanted to start getting some experience in this field. How do you even get experience in this normally anyways, do you get into really good schools and into their AI labs which have a lot of funding?
You can do performance work on any system you have, really; it's just that the details change depending on what you're targeting. You can definitely learn the basics on something like a 3060 by following blog posts.
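For a concrete sense of where those blog posts usually start, here's a minimal sketch (kernel and numbers are illustrative, not from any particular post): time a bandwidth-bound SAXPY kernel with CUDA events and compare the achieved GB/s against your card's spec sheet. This runs fine on a 3060.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One fused multiply-add per element: y[i] = a * x[i] + y[i].
// Bandwidth-bound, so it's a good first target for measuring your GPU.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 24;
    float *x, *y;
    cudaMalloc(&x, n * sizeof(float));
    cudaMalloc(&y, n * sizeof(float));

    // Time the kernel with CUDA events; the interesting part is comparing
    // the achieved bandwidth against the card's theoretical peak.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    // 3 floats moved per element: read x, read y, write y.
    double gbps = 3.0 * n * sizeof(float) / ms / 1e6;
    printf("%.3f ms, %.1f GB/s\n", ms, gbps);

    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

From there the usual path is profiling with Nsight Compute and working up to tiled matrix multiplies.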
This isn't really true.
In this case it's specific to NVidia's tensor matrix multiply-add (MMA) instructions, which let the GPU use silicon that would otherwise sit unused at that point.
> Why does publishing papers require the latest and greatest GPUs?
You really do need to test these things on real hardware, and across hardware. When you are doing unexpected things, there are lots of unexpected interaction effects.
As a reminder, the context is "require the latest and greatest GPUs", responding to the parent comment. "General" doesn't mean "you can do this on an Intel Arc GPU" level of general.
That said, my comment could have used a bit more clarity.