Some sort of typed 'named tensor' that could be combined with einsum notation at runtime would be awesome, ie. (don't really know TS/JS well but pseudocode)
import * as t from 'pytorch'
import * as nn from 'pytorch/nn'
const tensorA: Tensor[Batch, Seq, Emb] = t.randn([10,10,10]) // initialize tensor
const transformLayer = nn.Einsum('(Batch, Seq, Emb), (Emb) -> (Batch, Seq)')
const tensorB: Tensor[Emb2] = t.randn([20])
const transformedOutput = transformLayer(tensorA, tensorB) // type error: Emb2 does not match Emb
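For what it's worth, a good chunk of that type error is already expressible in today's TypeScript with branded dimension types. A rough, self-contained sketch (`dim`, `randn`, and `contract` are made-up names for illustration, not a real PyTorch binding):

```typescript
// Hypothetical sketch: dimension names live as phantom types on the shape tuple.
type Dim<Name extends string> = number & { readonly dimName: Name };
type Tensor<Shape extends readonly Dim<string>[]> = { shape: Shape };

// 'name' exists only so the literal type can be captured; the runtime value is just the size.
const dim = <Name extends string>(name: Name, size: number) => size as Dim<Name>;

const randn = <S extends readonly Dim<string>[]>(shape: S): Tensor<S> => ({ shape });

// Contraction over the last axis: the contracted dimension E must be the same
// named type in both operands.
const contract = <B extends Dim<string>, S extends Dim<string>, E extends Dim<string>>(
  a: Tensor<readonly [B, S, E]>,
  b: Tensor<readonly [E]>
): Tensor<readonly [B, S]> => ({ shape: [a.shape[0], a.shape[1]] as const });

const tensorA = randn([dim('Batch', 10), dim('Seq', 10), dim('Emb', 10)] as const);
const tensorB = randn([dim('Emb', 10)] as const);
const out = contract(tensorA, tensorB); // Tensor<readonly [Dim<'Batch'>, Dim<'Seq'>]>
// contract(tensorA, randn([dim('Emb2', 20)] as const)); // type error: 'Emb2' ≠ 'Emb'
```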
[0]: https://github.com/pytorch/pytorch/issues/26889

When I initially started implementing this I was hung up on similar concerns. For example, in GPT2/PotatoGPT the MLP layer is 4x the width of the residual stream. I went down a rabbit hole of addition and multiplication in TypeScript types (the type system is Turing complete, so it's technically possible!) and, after crashing my TS language server a bunch, I switched tactics.
Where I ended up was to use symbolic equivalence, which turned out to be more ergonomic anyway, i.e.
type Multiply<A extends number, B extends number> =
  number & { label: `${A} * ${B}` }

const Multiply = <A extends number, B extends number>(a: A, b: B) =>
  (a * b) as Multiply<A, B>;
such that

tensor([
  params.EmbeddingDimensions, // This is a literal with known size
  Multiply(4, params.EmbeddingDimensions)
] as const)

is inferred as Tensor<readonly [768, Multiply<4, 768>]>
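The `tensor` helper itself just needs to capture the shape tuple's literal type. A minimal, self-contained version of how that could look (this is a sketch, not PotatoGPT's actual signature; `Multiply` is restated so the snippet compiles on its own):

```typescript
// Restated from above so this snippet stands alone.
type Multiply<A extends number, B extends number> =
  number & { label: `${A} * ${B}` };
const Multiply = <A extends number, B extends number>(a: A, b: B) =>
  (a * b) as Multiply<A, B>;

type Shape = readonly number[];
type Tensor<S extends Shape> = { shape: S; data: Float64Array };

// Capturing S as a generic parameter is all it takes for the literal
// tuple type (including the Multiply brand) to flow through.
const tensor = <S extends Shape>(shape: S): Tensor<S> => ({
  shape,
  data: new Float64Array(shape.reduce((a, b) => a * b, 1)),
});

const t = tensor([768, Multiply(4, 768)] as const);
// t is inferred as Tensor<readonly [768, Multiply<4, 768>]>
```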
Notably, switching to a more symbolic approach makes it easier to type-check dimensions that can change at runtime, so something like:

tensor([
  Var(tokens.length, 'Sequence Length'),
  Multiply(4, Var(tokens.length, 'Sequence Length'))
] as const)

infers as

Tensor<readonly [
  Var<'Sequence Length'>,
  Multiply<4, Var<'Sequence Length'>>
]>
And you'll get all the same correctness constraints that you would if these were known dimensions.

The downside to this approach is that TypeScript won't know that Multiply<4, Var<'A'>> is equivalent to Multiply<Var<'A'>, 4>, but in practice I haven't found this to be a problem.
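Var isn't spelled out above; a minimal definition in the same type/value-pair style as Multiply might look like this (a sketch, not the actual PotatoGPT code):

```typescript
// Hypothetical Var: brands a runtime value with a symbolic name.
// The brand records only the name, not the size, so two Vars with the
// same name type-check as the same dimension whatever their runtime value.
type Var<Name extends string> = number & { label: Name };
const Var = <Name extends string>(value: number, name: Name) =>
  value as Var<Name>;

const tokens = ['a', 'b', 'c'];
const seqLen = Var(tokens.length, 'Sequence Length'); // Var<'Sequence Length'>
const same: Var<'Sequence Length'> = seqLen;          // OK: names unify
// const other: Var<'Batch'> = seqLen;                // type error: names differ
```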
Finally, on more complicated operators/functions that compose dimensions from different variables, TypeScript is also very capable, albeit not the most ergonomic. You can check my code for matrix multiplication, and Seb's writeup for another example (a zip function).
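To give a flavour of what composing dimensions from two operands looks like, here is a shape-checked matmul signature in the same style (my own sketch, not the code referenced above): the inner dimension K must be the same type in both arguments, and the result's shape is composed from both.

```typescript
type Shape = readonly number[];
type Tensor<S extends Shape> = { shape: S; data: Float64Array };

const tensor = <S extends Shape>(shape: S): Tensor<S> => ({
  shape,
  data: new Float64Array(shape.reduce((a, b) => a * b, 1)),
});

// [M, K] x [K, N] -> [M, N]; mismatched inner dimensions fail at compile time.
const matmul = <M extends number, K extends number, N extends number>(
  a: Tensor<readonly [M, K]>,
  b: Tensor<readonly [K, N]>
): Tensor<readonly [M, N]> => {
  const [m, k] = a.shape;
  const n = b.shape[1];
  const out = tensor([m, n] as const);
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < n; j++) {
      let s = 0;
      for (let p = 0; p < k; p++) s += a.data[i * k + p] * b.data[p * n + j];
      out.data[i * n + j] = s;
    }
  }
  return out;
};

const a = tensor([2, 3] as const);
const b = tensor([3, 4] as const);
const c = matmul(a, b); // Tensor<readonly [2, 4]>
// matmul(tensor([2, 3] as const), tensor([4, 4] as const)); // type error: 3 ≠ 4
```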