
"CUDNN API supported by HIP" has a coverage table: https://rocm.docs.amd.com/projects/HIPIFY/en/amd-staging/tab...

ROCm/hipDNN wraps cuDNN on Nvidia and MIOpen on AMD, but it hasn't been updated in a while: https://github.com/ROCm/hipDNN

>>37808036 : conda-forge has various BLAS implementations, including MKL-optimized BLAS, and compatible NumPy and SciPy builds.

BLAS: Basic Linear Algebra Subprograms: https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprogra...
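
To see which BLAS a given NumPy build is actually linked against (MKL, OpenBLAS, etc.), something like this should work. It's only a sketch: blas_build_report is a name I made up, and the layout of the text NumPy prints differs between versions, so it just captures the output rather than parsing it.

```python
import contextlib
import io


def blas_build_report():
    """Return NumPy's build configuration as text, or None if NumPy is missing.

    blas_build_report is a hypothetical helper name, not a NumPy API.
    """
    try:
        import numpy as np
    except ImportError:
        return None
    buf = io.StringIO()
    # numpy.show_config() prints the build info (including BLAS/LAPACK details),
    # so capture stdout instead of relying on a return value.
    with contextlib.redirect_stdout(buf):
        np.show_config()
    return buf.getvalue()


report = blas_build_report()
print(report if report is not None else "NumPy is not installed")
```

Look for the BLAS/LAPACK entries in the printed text to tell which implementation the environment picked up.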

"Using CuPy on AMD GPU (experimental)" https://docs.cupy.dev/en/v13.0.0/install.html#using-cupy-on-... :

  $ sudo apt install hipblas hipsparse rocsparse rocrand rocthrust rocsolver rocfft hipcub rocprim rccl
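
Once those packages and a ROCm build of CuPy are installed, the same array code can run on either backend. A minimal sketch, assuming you want to fall back to NumPy when CuPy or a usable GPU isn't there (pick_array_module is a hypothetical helper, not part of CuPy):

```python
import importlib


def pick_array_module(candidates=("cupy", "numpy")):
    """Return (name, module) for the first usable candidate, else (None, None).

    pick_array_module is a hypothetical helper, not a CuPy or NumPy API.
    """
    for name in candidates:
        try:
            mod = importlib.import_module(name)
            if name == "cupy":
                # Importing CuPy can succeed without a working GPU runtime;
                # querying the device count checks that a device is visible.
                if mod.cuda.runtime.getDeviceCount() < 1:
                    continue
            return name, mod
        except Exception:
            # Not installed, or the GPU runtime failed to initialise.
            continue
    return None, None


name, xp = pick_array_module()
if xp is not None:
    a = xp.arange(6, dtype=xp.float64).reshape(2, 3)
    # The matrix product is dispatched to the backend's BLAS
    # (a GPU BLAS under CuPy, the CPU BLAS under NumPy).
    print(name, float((a @ a.T).sum()))
else:
    print("neither CuPy nor NumPy is available")
```

The broad except is deliberate for a sketch like this, since a missing or misconfigured GPU runtime can fail in several different ways.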

replies(1): >>HarHar+Uf

>>westur+(OP)
I guess I misunderstood you.

You were asking if this CUDA compatibility layer might hold any advantage over HIP (e.g. for use by llama.cpp)?

I think the answer is no, since HIP includes pretty full-featured support for many of the higher-level CUDA-based APIs (cuDNN, cuBLAS, etc.), while per the Phoronix article ZLUDA currently has only minimal support for them.

I wouldn't expect ZLUDA to provide any performance benefit over HIP either, since on AMD hardware HIP is just a pass-through to MIOpen (AMD's equivalent to cuDNN), rocBLAS, etc.