PyTorch for WebGPU

>>mighdo+(OP)
I'm excited about this for probably different reasons than most: I think TypeScript could be a more ergonomic way to develop ML models than Python because you can automatically infer and check tensor dimensions while you are writing code! Compare this to the mess of comments you usually see when writing PyTorch, telling you that x is of shape [x, y, z].

  // An empty 3x4 matrix
  const tensorA = tensor([3, 4])
  
  // An empty 4x5 matrix
  const tensorB = tensor([4, 5])

  const good = multiplyMatrix(tensorA, tensorB);
        ^
        Inferred type is Tensor<readonly [3, 5]>
  
  const bad = multiplyMatrix(tensorB, tensorA);
                             ^^^^^^^
                             Argument of type 'Tensor<readonly [4, 5]>' is not 
                             assignable to parameter of type '[never, "Differing 
                             types", 3 | 5]'.(2345)
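
For anyone curious how types like these could be wired up, here is a minimal sketch. It is not the PotatoGPT code: it only handles 2-D shapes, the names Tensor, tensor, multiplyMatrix and SameDim are illustrative, and the compiler error it produces will be worded differently from the one above.

```typescript
// Hypothetical sketch: every name here is illustrative, not from any library.
interface Tensor<S extends readonly number[]> {
  readonly shape: S;
  readonly data: Float32Array;
}

// R and C are inferred as number literals (3 and 4 for tensor([3, 4])).
function tensor<R extends number, C extends number>(
  shape: readonly [R, C],
): Tensor<readonly [R, C]> {
  const [rows, cols] = shape;
  return { shape, data: new Float32Array(rows * cols) };
}

// Resolves to `unknown` when the two dimensions match and `never` otherwise.
// Intersecting the second parameter with it makes a mismatch a type error.
type SameDim<A, B> = [A] extends [B] ? ([B] extends [A] ? unknown : never) : never;

function multiplyMatrix<
  M extends number,
  K extends number,
  K2 extends number,
  N extends number,
>(
  a: Tensor<readonly [M, K]>,
  b: Tensor<readonly [K2, N]> & SameDim<K, K2>,
): Tensor<readonly [M, N]> {
  const rhs: Tensor<readonly [K2, N]> = b;
  const [m, k] = a.shape;
  const [k2, n] = rhs.shape;
  // Runtime guard for callers that bypass the type checker.
  if ((k as number) !== (k2 as number)) {
    throw new Error(`inner dimensions differ: ${k} vs ${k2}`);
  }
  const out = new Float32Array(m * n);
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < n; j++) {
      let sum = 0;
      for (let p = 0; p < k; p++) {
        sum += a.data[i * k + p] * rhs.data[p * n + j];
      }
      out[i * n + j] = sum;
    }
  }
  return { shape: [m, n] as const, data: out };
}

const tensorA = tensor([3, 4]);
const tensorB = tensor([4, 5]);
const good = multiplyMatrix(tensorA, tensorB); // Tensor<readonly [3, 5]>

// Under these assumed types the next line should be rejected at compile time,
// because the inner dimensions 5 and 3 differ:
// const bad = multiplyMatrix(tensorB, tensorA);
```

The K and K2 parameters are kept separate on purpose. With a single shared K, TypeScript would likely infer the union 3 | 5 and accept the bad call, which is presumably why the error above mentions "Differing types".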

I prototyped this for PotatoGPT [1] and some kind stranger on the internet wrote up a more extensive take [2]. You can play with an early version on the TypeScript playground here [3] (uses a Twitter shortlink for brevity).

[1] https://github.com/newhouseb/potatogpt

[2] https://sebinsua.com/type-safe-tensors

[3] https://t.co/gUzzTl4AAN

>>newhou+Zd
Without multidimensional array slicing or operator overloading, it seems like TypeScript could never be anywhere near as ergonomic as Python for ML, despite its other advantages.

>>modele+0v
Those are niceties and can be implemented with some small hacks. Most big nets do very little slicing. Lots of dimension permutations (transpose, reshape, and friends) but less slicing. I personally use a lot of slicing so will do my best to support a clean syntax.
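
To give a rough idea of the kind of small hack this could mean, here is a sketch that accepts Python-style "start:stop" strings per axis. Everything in it is an assumption for illustration (the Matrix type, parseSlice, slice2d); it copies data rather than creating a view, only handles 2-D, and has no step support.

```typescript
// Hypothetical sketch: Matrix, parseSlice and slice2d are illustrative names.
interface Matrix {
  shape: readonly [number, number];
  data: Float32Array; // row-major storage
}

// Parses "start:stop" for an axis of the given size. Either bound may be
// omitted, and negative bounds count from the end, as in Python.
function parseSlice(spec: string, size: number): [number, number] {
  const parts = spec.split(":");
  if (parts.length !== 2) {
    throw new Error(`expected "start:stop", got "${spec}"`);
  }
  const resolve = (text: string, fallback: number): number => {
    if (text.trim() === "") return fallback;
    const value = Number.parseInt(text, 10);
    if (Number.isNaN(value)) throw new Error(`bad slice bound "${text}"`);
    const wrapped = value < 0 ? value + size : value;
    return Math.min(Math.max(wrapped, 0), size); // clamp into [0, size]
  };
  const start = resolve(parts[0], 0);
  const stop = resolve(parts[1], size);
  return [start, Math.max(start, stop)];
}

// Copies the selected rows and columns into a new, smaller matrix.
function slice2d(m: Matrix, rows: string, cols: string): Matrix {
  const [height, width] = m.shape;
  const [r0, r1] = parseSlice(rows, height);
  const [c0, c1] = parseSlice(cols, width);
  const out = new Float32Array((r1 - r0) * (c1 - c0));
  let k = 0;
  for (let r = r0; r < r1; r++) {
    for (let c = c0; c < c1; c++) {
      out[k++] = m.data[r * width + c];
    }
  }
  return { shape: [r1 - r0, c1 - c0], data: out };
}

// A 3x4 matrix holding 0..11; roughly m[1:3, :-1] in Python notation.
const m: Matrix = { shape: [3, 4], data: Float32Array.from({ length: 12 }, (_, i) => i) };
const sub = slice2d(m, "1:3", ":-1");
```

String specs lose compile-time checking of the slice bounds, so this trades some of the type safety discussed upthread for terser call sites.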

>>praecl+xw
I've come to believe over the last few years that slicing is one of the most critical parts of a good ML array framework, and I've used it heavily for a number of things. PyTorch, if I understand correctly, still doesn't get some forms of slice assignment and the handling of slice objects right (please correct me if I'm wrong), though it is leagues better than TensorFlow was.

I've written a lot of dataloader and similar code over the last number of years, and the slicing was probably the most important (and most hair-pulling) part for me. I've really debated writing my own wrapper at some point (if it is indeed worth the effort) just to keep my sanity, even if it is at the expense of some speed.
