Perhaps they think using GPUs for computation is a passing fad? They hate money? Their product is actually terrible and they don't want to get found out (that one might be true for Intel)?
[1] https://www.reddit.com/r/Amd/comments/140uct5/geohot_giving_...
They haven't had graphics card driver issues in years now, and people still say "oh, I don't want AMD cos their drivers don't work".
Yes, much needed.
Here's a list of possible "monopoly breakers" I'm going to write about in another post - some of these are things people are using today, some are available but don't have much user adoption, some are technically available but very hard to purchase or rent/use, and some aren't yet available:
* Software: OpenAI's Triton (you might've noticed it mentioned in some of the "TheBloke" model releases and as an option in the oobabooga text-generation-webui), Modular's Mojo (on top of MLIR), OctoML (from the creators of TVM), geohot's tiny corp, CUDA porting efforts, and PyTorch as a way of reducing reliance on CUDA (rough sketch of that last idea after this list)
* Hardware: TPUs, Amazon Inferentia, cloud companies working on their own chips (Microsoft Project Athena, AWS Trainium, TPU v5), chip startups (Cerebras, Tenstorrent), AMD's MI300A and MI300X, Tesla Dojo and D1, Meta's MTIA, Habana Gaudi, LLM ASICs, [+ Moore Threads]
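To make the PyTorch/Triton point from the software bullet concrete, here's a minimal sketch (assuming a PyTorch 2.x install; the ROCm build exposes HIP devices through the torch.cuda API, and torch.compile lowers to Triton kernels on GPU rather than hand-written CUDA):

    import torch

    def backend_info():
        # Works unchanged on CUDA and ROCm builds: the ROCm build reuses the
        # torch.cuda API, so most device-selection code never mentions HIP.
        if not torch.cuda.is_available():
            return "cpu only"
        hip = getattr(torch.version, "hip", None)
        if hip:  # ROCm/HIP build
            return f"ROCm {hip}: {torch.cuda.get_device_name(0)}"
        return f"CUDA {torch.version.cuda}: {torch.cuda.get_device_name(0)}"

    print(backend_info())

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.nn.Linear(4096, 4096).to(device)
    compiled = torch.compile(model)  # TorchInductor emits Triton kernels on GPU
    x = torch.randn(8, 4096, device=device)
    print(compiled(x).shape)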
The A100/H100 with InfiniBand is still the most common request from startups doing LLM training, though.
The current angle I'm thinking about for the post would be to actually use them all. Take Llama 2 and see which software and hardware approaches we can get inference working on (I'd leave training to a follow-up post), write about how much of a hassle each one is (to get access to / purchase / rent, and to get running), and what the inference speed is like. That might be too ambitious, though; I could see it taking a while. If any freelancers want to help me research and write this, my email is in my profile. No points for companies that talk a big game but don't have a product that can actually be purchased/used, I think - they'd be relegated to a "things to watch for in future" section.
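For the inference-speed comparison, the per-backend harness could be as simple as something like this (a rough sketch, assuming a Hugging Face transformers install and access to the gated meta-llama/Llama-2-7b-hf weights - the model id and prompt are just placeholders):

    import time
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Llama-2-7b-hf"  # placeholder; swap per backend
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto")

    inputs = tok("The CUDA monopoly will be broken when",
                 return_tensors="pt").to(model.device)

    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

    new_tokens = out.shape[-1] - inputs["input_ids"].shape[-1]
    print(f"{new_tokens / elapsed:.1f} tokens/sec on {model.device}")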
The H100s are actually very good for inference.
As for drivers: https://www.tomshardware.com/news/adrenalin-23-7-2-marks-ret...
(To their credit, AMD is also getting serious lately: they put out a listing for around 30 ROCm developers a few weeks after geohot's meltdown, and at the time they were also working on a Windows release of ROCm (previously Linux-only) with support for consumer gaming GPUs. The message seems to have finally been received; it's a perennial topic here and elsewhere, and with the obvious shower of money happening, maybe management was finally receptive to the idea that they needed to step it up.)
I've enabled nearly all GFX9 and GFX10 GPUs while packaging the libraries for Debian. I haven't tested every library with every GPU, but in my experience they pretty much all work. I suspect that will also be true of GFX11 once we move rocm-hipamd to LLVM 16.
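(If you want a quick smoke test of a distro-packaged ROCm stack without pulling in a whole framework, something like this works - the libamdhip64 soname is an assumption based on what the Debian packages ship, adjust to taste:)

    import ctypes
    import ctypes.util

    # Load the HIP runtime shipped by the distro packages (name is an assumption).
    libname = ctypes.util.find_library("amdhip64") or "libamdhip64.so"
    hip = ctypes.CDLL(libname)

    count = ctypes.c_int(0)
    err = hip.hipGetDeviceCount(ctypes.byref(count))
    print("hipGetDeviceCount returned", err, "- visible GPUs:", count.value)

    version = ctypes.c_int(0)
    hip.hipRuntimeGetVersion(ctypes.byref(version))
    print("HIP runtime version:", version.value)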
One bad driver update is not indicative of anything. Nvidia has had bad driver updates too, but you're not shitting all over them for it. And running Nvidia's own drivers on Linux is still a pain point.
(And don't try to claim I'm an AMD fanboy when I don't even have any AMD stuff at the moment. It's all Intel/Nvidia.)