zlacker

Is it really worth the trouble if you're not building on top of something like LLVM, which already has a vectorizer? We're still waiting for the mythical sufficiently-smart-vectorizer; even the better ones are still extremely brittle, and any serious high-performance work still does explicit SIMD rather than trying to coax the vectorizer into cooperating.

I'd rather see new languages focus on making better explicit SIMD abstractions a la Intel's ISPC, rather than writing yet another magic vectorizer that only actually works in trivial cases.

replies(2): >>mgauna+f8 >>neonsu+Bh

>>jshear+(OP)
Any of the polyhedral frameworks is reasonably good at splitting loop nests into parallelizable ones.

Then it's just a codegen problem.

But yes, ultimately, the user needs to be aware of how the language works, what is parallelizable and what isn't, and of the cost of the operations that they ask their computer to execute.
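
To make the "splitting" part concrete, here is a hand-written sketch in C of the simplest such transformation, loop fission. It is not the output of any particular polyhedral tool; the function names `fused` and `split` and the integer arrays are made up for illustration. The first loop mixes an independent per-element multiply with a running sum that carries a dependency between iterations; the second version separates them, so the multiply loop has no cross-iteration dependency and is a much easier target for codegen.

```c
#include <stddef.h>

/* Hypothetical example: fused loop. The running sum (acc) depends on the
   previous iteration, which tends to hold back the whole loop body. */
void fused(const int *restrict a, const int *restrict b,
           int *restrict out, int *restrict prefix, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] * b[i];  /* independent per element */
        acc += out[i];         /* loop-carried dependency */
        prefix[i] = acc;
    }
}

/* Same computation after loop fission. The first loop has no dependency
   between iterations; the second keeps the serial running sum. */
void split(const int *restrict a, const int *restrict b,
           int *restrict out, int *restrict prefix, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] * b[i];  /* parallelizable part */
    }
    int acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += out[i];         /* serial part, isolated */
        prefix[i] = acc;
    }
}
```

Both functions write the same values to `out` and `prefix`. Whether the compiler actually emits SIMD for the first loop of `split` still depends on the target and flags, which is the codegen problem mentioned above.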

>>jshear+(OP)
C# is doing that :)

https://learn.microsoft.com/en-us/dotnet/api/system.runtime....

Examples of usage:

- https://github.com/U8String/U8String/blob/main/Sources/U8Str...

- https://github.com/nietras/1brc.cs/blob/main/src/Brc/BrcAccu...

- https://github.com/dotnet/runtime/blob/main/src/libraries/Sy...

(and many more if you search github for the uses of Vector128/256<byte> and the like!)

replies(1): >>pjmlp+Yq1

>>neonsu+Bh
Java as well. Unfortunately, it will be kept around as a preview until Valhalla arrives (if ever).