1. jeffbe+(OP) 2024-01-16 14:06:15
FYI: https://quick-bench.com/q/sK9t9GoFDRkx9XxloUUbB8Q3ht4

On an Intel Sapphire Rapids CPU, this microbenchmark takes ~980ns when compiled with -march=k8 (to get the older form) and ~570ns when compiled with -march=native. It's not at all clear that the imperfection the article describes is really relevant in context, because the compiler transforms this function into something quite different.

replies(1): >>fooker+8a
2. fooker+8a 2024-01-16 15:03:25
>>jeffbe+(OP)
With random test cases, branch prediction can't help.
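
A rough sketch of that point, not the benchmark linked above: the helper names below are hypothetical, and the 50/50 threshold is an assumption chosen for illustration. On uniformly random input the comparison in the first loop goes either way with equal probability, so a predictor has no pattern to learn. The second loop computes the same count without letting the comparison steer control flow. An optimizing compiler may well turn the first form into the second on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: counts values below `limit` using a data-dependent branch.
std::size_t count_below_branchy(const std::vector<std::uint32_t>& v,
                                std::uint32_t limit) {
    std::size_t n = 0;
    for (std::uint32_t x : v) {
        if (x < limit) {  // outcome depends entirely on the input data
            ++n;
        }
    }
    return n;
}

// Same result, with the comparison feeding arithmetic instead of a branch.
std::size_t count_below_branchless(const std::vector<std::uint32_t>& v,
                                   std::uint32_t limit) {
    std::size_t n = 0;
    for (std::uint32_t x : v) {
        n += static_cast<std::size_t>(x < limit);
    }
    return n;
}

// Hypothetical helper: fills a vector with pseudo-random 32-bit values.
std::vector<std::uint32_t> make_random_input(std::size_t size,
                                             std::uint32_t seed) {
    std::mt19937 gen(seed);
    std::vector<std::uint32_t> v(size);
    for (auto& x : v) {
        x = static_cast<std::uint32_t>(gen());
    }
    return v;
}
```

Timing both functions on `make_random_input(n, seed)` and then on a sorted copy of the same data would be one way to see how much of any difference comes from prediction rather than from the instruction sequence itself.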