zlacker

1. mappu+(OP) 2023-01-23 06:06:59
Cgo is a real option here, or at least would be an interesting comparison. The function is likely called with enough data per call to amortize the Cgo call overhead (on the order of tens of nanoseconds), and a Cgo version is certainly more maintainable.

Another option is building with GOAMD64=v3 or GOAMD64=v4, which lets the regular Go compiler emit AVX2 or AVX-512 instructions respectively, although it does almost no autovectorization compared to gccgo/gollvm.
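
To try the GOAMD64 route, a plain-Go baseline like the sketch below is enough to start with. The `Dot` function is a hypothetical stand-in, not the code from the benchmark being discussed; it is just an ordinary scalar loop that you can build at different GOAMD64 levels and compare.

```go
package main

import "fmt"

// Dot is a hypothetical example kernel: a plain scalar dot product.
// It uses no assembly and no Cgo, so any speedup at a higher
// GOAMD64 level comes only from what the Go compiler itself emits.
func Dot(a, b []float32) float32 {
	// Use the shorter length so mismatched inputs cannot panic.
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	var sum float32
	for i := 0; i < n; i++ {
		sum += a[i] * b[i]
	}
	return sum
}

func main() {
	fmt.Println(Dot([]float32{1, 2, 3}, []float32{4, 5, 6}))
}
```

Building with `GOAMD64=v3 go build -gcflags=-S` prints the generated assembly, so you can check which instructions the loop actually ends up using. Keep in mind that a binary built for v3 or v4 will refuse to run on CPUs that lack those features.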

replies(1): >>zhengh+67
2. zhengh+67 2023-01-23 07:47:15
>>mappu+(OP)
It is a pity that CGO isn't in the final benchmark. >_<