zlacker

1. sva_+(OP)[view] [source] 2023-05-19 22:01:23
Very impressive work. Would be interesting to do some benchmarks versus PyTorch.

On a side note, I'm not sure if it is just because I've looked at so many autograd engines by now, but it is really cool to see that, after years of different frameworks being developed, most people seem to have converged on common concepts and structure for implementing something like this. It is pretty easy to dive into this codebase, even without being particularly skilled in JS/TS.

Wondering how such frameworks will look in a couple years.
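
A minimal sketch of the kind of PyTorch baseline I mean, for comparison against webgpu-torch (sizes, iteration counts, and the helper name are illustrative):

    # Minimal matmul timing baseline one could compare webgpu-torch against.
    import time
    import torch

    def bench_matmul(n=1024, iters=100, device="cpu"):
        a = torch.randn(n, n, device=device)
        b = torch.randn(n, n, device=device)
        for _ in range(10):  # warmup: avoid one-time init skewing timings
            a @ b
        if device == "cuda":
            torch.cuda.synchronize()  # flush queued GPU work before timing
        t0 = time.perf_counter()
        for _ in range(iters):
            a @ b
        if device == "cuda":
            torch.cuda.synchronize()
        ms = (time.perf_counter() - t0) / iters * 1e3
        print(f"{n}x{n} matmul on {device}: {ms:.3f} ms/iter")

    bench_matmul()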

replies(1): >>westur+1z
2. westur+1z[view] [source] 2023-05-20 04:55:29
>>sva_+(OP)
Could there be something like emscripten-forge/requests-wasm-polyfill for PyTorch with WebGPU? https://github.com/emscripten-forge/requests-wasm-polyfill

How does the performance of webgpu-torch compare to compiling PyTorch to WASM with emscripten and WebGPU?

tfjs benchmarks: Environment > backend > {WASM, WebGL, CPU, WebGPU, tflite} https://tensorflow.github.io/tfjs/e2e/benchmarks/local-bench... src: https://github.com/tensorflow/tfjs/tree/master/e2e/benchmark...

tensorflow/tfjs https://github.com/tensorflow/tfjs

tfjs-backend-wasm https://github.com/tensorflow/tfjs/tree/master/tfjs-backend-...

tfjs-backend-webgpu https://github.com/tensorflow/tfjs/tree/master/tfjs-backend-...

([...], tflite-support, tflite-micro)

From facebookresearch/shumai (a JS tensor library) https://github.com/facebookresearch/shumai/issues/122 :

> It doesn't make sense to support anything besides WebGPU at this point. WASM + SIMD is around 15-20x slower on my machine[1]. Although WebGL is more widely supported today, it doesn't have the compute features needed for efficient modern ML (transformers etc) and will likely be a deprecated backend for other frameworks when WebGPU comes online.

tensorflow rust has a struct.Tensor: https://tensorflow.github.io/rust/tensorflow/struct.Tensor.h...

"ONNX Runtime merges WebGPU backend" https://github.com/microsoft/onnxruntime https://news.ycombinator.com/item?id=35696031 ... TIL about wonnx: https://github.com/webonnx/wonnx#in-the-browser-using-webgpu...

microsoft/onnxruntime: https://github.com/microsoft/onnxruntime

Apache/arrow has language-portable Tensors for cpp: https://arrow.apache.org/docs/cpp/api/tensor.html and rust: https://docs.rs/arrow/latest/arrow/tensor/struct.Tensor.html and Python: https://arrow.apache.org/docs/python/api/tables.html#tensors https://arrow.apache.org/docs/python/generated/pyarrow.Tenso...

Fwiw it looks like the llama.cpp Tensor is from ggml, for which there are CUDA and OpenCL implementations (but not yet ROCm, or a WebGPU shim for use with emscripten transpilation to WASM): https://github.com/ggerganov/llama.cpp/blob/master/ggml.h

What are the recommended ways to cast e.g. Arrow Tensors to PyTorch/TensorFlow?
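
A minimal sketch, assuming NumPy as the bridge (AFAIK there is no direct Arrow-to-framework cast; pyarrow.Tensor round-trips through NumPy, zero-copy where dtype and layout allow):

    import numpy as np
    import pyarrow as pa
    import tensorflow as tf
    import torch

    arr = np.arange(12, dtype=np.float32).reshape(3, 4)
    at = pa.Tensor.from_numpy(arr)       # Arrow Tensor, zero-copy view

    np_view = at.to_numpy()              # back to NumPy, zero-copy
    pt = torch.from_numpy(np_view)       # shares memory with np_view
    tft = tf.convert_to_tensor(np_view)  # TensorFlow may copy here

    print(pt.shape, tuple(tft.shape))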

FWIU, Rust has a better compilation story for WASM; that path is probably faster than TensorFlow already compiled to JS/ES + WebGPU.

What's a fair benchmark?

replies(2): >>westur+rz >>smhx+A31
3. westur+rz[view] [source] [discussion] 2023-05-20 05:01:47
>>westur+1z
> What's a fair benchmark?

- /? pytorch tensorflow benchmarks webgpu 2023 site:github.com https://www.google.com/search?q=pytorch+tensorflow+benchmark...

- [tfjs benchmarks]

- huggingface/transformers:src/transformers/benchmark https://github.com/huggingface/transformers/tree/main/src/tr...
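
The transformers benchmark utilities linked above can be driven roughly like this; a sketch per the 2023-era docs (these classes are deprecated in newer releases, so check the current docs):

    from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

    args = PyTorchBenchmarkArguments(
        models=["bert-base-uncased"],  # illustrative model choice
        batch_sizes=[1],
        sequence_lengths=[128],
    )
    results = PyTorchBenchmark(args).run()  # inference speed/memory results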

4. smhx+A31[view] [source] [discussion] 2023-05-20 12:30:51
>>westur+1z
>What's a fair benchmark?

The absolute golden benchmarks are https://github.com/pytorch/benchmark : a diverse set of userland code taken from GitHub as-is and made into benchmarks.
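
A sketch of driving it, from memory of the repo's README (entry points and flags may have changed, so check the current README):

    import subprocess

    # one-time setup inside a clone of pytorch/benchmark:
    # installs per-model dependencies
    subprocess.run(["python", "install.py"], check=True)

    # run one model, e.g. eval on CUDA (model name is illustrative)
    subprocess.run(["python", "run.py", "resnet50", "-d", "cuda", "-t", "eval"],
                   check=True)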
