
Thread: "std::clamp generates less efficient assembly than std::min(max, std::max(min, v))"
1. jeffbe+Cj 2024-01-16 13:32:55
>>x1f604+(OP)
Clang generates the shortest of these if you target sandybridge, or x86-64-v3, or later. The real story buried in this article is that compilers target k8-generic unless you tell them otherwise, and the features and cost model of Opteron are obsolete.

Always specify your target.
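
To try this yourself, here is a minimal sketch of the two formulations named in the thread title. The function names (clamp_std, clamp_minmax, check_agreement) and the file name are made up for illustration, and this is not the code from the article. Compiling it with something like `g++ -std=c++17 -O3 -S clamp.cpp`, then again with `-march=x86-64-v3` added, should let you compare the generated assembly for each target.

```cpp
#include <algorithm>
#include <cassert>

// Formulation 1: the standard library clamp (requires C++17).
float clamp_std(float v, float lo, float hi) {
    return std::clamp(v, lo, hi);
}

// Formulation 2: the min/max composition from the thread title.
float clamp_minmax(float v, float lo, float hi) {
    return std::min(hi, std::max(lo, v));
}

// Sanity check on ordinary (non-NaN) inputs with lo <= hi:
// both formulations should return the same value.
void check_agreement() {
    assert(clamp_std(0.5f, 0.0f, 1.0f) == clamp_minmax(0.5f, 0.0f, 1.0f));
    assert(clamp_std(-2.0f, 0.0f, 1.0f) == clamp_minmax(-2.0f, 0.0f, 1.0f));
    assert(clamp_std(7.0f, 0.0f, 1.0f) == clamp_minmax(7.0f, 0.0f, 1.0f));
}
```

The check only covers ordinary inputs. It says nothing about edge cases, and nothing about which version the compiler optimizes better. That part you have to read off the assembly for the target you actually care about.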

2. x1f604+Qd7 2024-01-18 09:07:03
>>jeffbe+Cj
Even with -march=x86-64-v4 at -O3 the compiler still generates fewer lines of assembly for the incorrect clamp compared to the correct clamp for this "realistic" code:

https://godbolt.org/z/hd44KjMMn
