Thread: "std::clamp generates less efficient assembly than std::min(max, std::max(min, v))"
1. fooker+Ah 2024-01-16 13:18:31
>>x1f604+(OP)
If you benchmark these, you'll likely find that the version with the jump narrowly beats the one with the conditional instruction in practice.
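
For concreteness, here is a minimal sketch of the two formulations being compared. The function names `clamp_branchy` and `clamp_minmax` are made up for illustration, and which one turns into jumps versus conditional or min/max instructions depends on the compiler, flags, and target, so check the generated assembly rather than trusting the comments.

```cpp
#include <algorithm>

// Hypothetical name: explicit comparisons, which a compiler may lower to
// compare-and-jump sequences.
constexpr double clamp_branchy(double v, double lo, double hi) {
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

// Hypothetical name: the min/max composition from the thread title, which a
// compiler may lower to branch-free min/max instructions.
constexpr double clamp_minmax(double v, double lo, double hi) {
    return std::min(hi, std::max(lo, v));
}
```

Both return the same value for ordinary (non-NaN) inputs with lo <= hi, so any benchmark difference comes from code generation and from how predictable the input data is.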
2. pclmul+xn 2024-01-16 13:57:56
>>fooker+Ah
Compilers often under-generate conditional instructions. They implicitly assume (correctly) that most branches you write are 90/10 (i.e., very predictable), not 50/50. The branches that really are 50/50 suffer from being treated as 90/10.
3. fooker+Tx 2024-01-16 14:59:59
>>pclmul+xn
The branches in this example are not 50/50.

Given a few million calls of clamp, most would be no-ops in practice. Modern CPUs are very good at dynamically observing this.

4. pclmul+x93 2024-01-17 06:26:04
>>fooker+Tx
Do you know that for a fact? For all calls of clamp? I have definitely used min and max when they are true 50/50s and I assume clamp also gets some similar use.
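
One way to check instead of assuming is to measure how often clamp is actually a no-op on your own inputs. A rough sketch, where `noop_fraction` is a hypothetical helper and not part of any library:

```cpp
// Hypothetical helper: fraction of inputs already inside [lo, hi], i.e. the
// share of clamp calls that would return the value unchanged.
// Assumes n > 0 and lo <= hi.
constexpr double noop_fraction(const int* data, int n, int lo, int hi) {
    int inside = 0;
    for (int i = 0; i < n; ++i) {
        if (data[i] >= lo && data[i] <= hi) {
            ++inside;
        }
    }
    return static_cast<double>(inside) / n;
}
```

A result near 1.0 matches the "mostly no-ops" case, while something near 0.5 suggests the branch outcome is closer to a coin flip. This counts outcomes only; a 50/50 split that follows a regular pattern can still be predictable, so treat it as a first check and not a substitute for benchmarking.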