zlacker

[return to "Std: Clamp generates less efficient assembly than std:min(max,std:max(min,v))"]
1. fooker+Ah[view] [source] 2024-01-16 13:18:31
>>x1f604+(OP)
If you benchmark these, you'll likely find the version with the jump edges out the one with the conditional instruction in practice.
◧◩
2. jeffbe+yo[view] [source] 2024-01-16 14:06:15
>>fooker+Ah
FYI. https://quick-bench.com/q/sK9t9GoFDRkx9XxloUUbB8Q3ht4'

Using this microbenchmark on an Intel Sapphire Rapids CPU, compiled with march=k8 to get the older form, takes ~980ns, while compiling with march=native gives ~570ns. It's not at all clear that the imperfection the article describes is really relevant in context, because the compiler transforms this function into something quite different.

◧◩◪
3. fooker+Gy[view] [source] 2024-01-16 15:03:25
>>jeffbe+yo
With random test cases, branch prediction can't help.
[go to top]