zlacker

[return to "Std: Clamp generates less efficient assembly than std:min(max,std:max(min,v))"]
1. celega+im[view] [source] 2024-01-16 13:50:05
>>x1f604+(OP)
On gcc 13, the difference in assembly between the min(max()) version and std::clamp is eliminated when I add the -ffast-math flag. I suspect that the two implementations handle one of the arguments being NaN a bit differently.

https://gcc.godbolt.org/z/fGaP6roe9

I see the same behavior on clang 17 as well

https://gcc.godbolt.org/z/6jvnoxWhb

◧◩
2. gumby+1n[view] [source] 2024-01-16 13:54:31
>>celega+im
You (celegans25) probably know this but here is a PSA that -ffast-math is really -finaccurate-math. The knowledgeable developer will know when to use it (almost never) while the naive user will have bugs.
◧◩◪
3. dahart+Vy[view] [source] 2024-01-16 15:05:15
>>gumby+1n
Why do you say almost never? Don’t let the name scare you; all floating point math is inaccurate. Fast math is only slightly less accurate, I think typically it’s a 1 or maybe 2 LSB difference. At least in CUDA it is, and I think many (most?) people & situations can tolerate 22 bits of mantissa compared to 23, and many (most?) people/situations aren’t paying attention to inf/nan/exception issues at all.

I deal with a lot of floating point professionally day to day, and I use fast math all the time, since the tradeoff for higher performance and the relatively small loss of accuracy are acceptable. Maybe the biggest issue I run into is lack of denorms with CUDA fast-math, and it’s pretty rare for me to care about numbers smaller than 10^-38. Heck, I’d say I can tolerate 8 or 16 bits of mantissa most of the time, and fast-math floats are way more accurate than that. And we know a lot of neural network training these days can tolerate less than 8 bits of mantissa.

◧◩◪◨
4. useful+EI[view] [source] 2024-01-16 15:46:10
>>dahart+Vy
> Fast math is only slightly less accurate

'slightly'? Last I checked, -Ofast completely breaks std::isnan and std::isinf--they always return false.

◧◩◪◨⬒
5. xigoi+7e1[view] [source] 2024-01-16 18:15:27
>>useful+EI
They are talking about -ffast-math, not -Ofast.
[go to top]