zlacker

Thread: "std::clamp generates less efficient assembly than std::min(max, std::max(min, v))"
1. celega+im 2024-01-16 13:50:05
>>x1f604+(OP)
On gcc 13, the difference in assembly between the min(max()) version and std::clamp is eliminated when I add the -ffast-math flag. I suspect that the two implementations handle one of the arguments being NaN a bit differently.

https://gcc.godbolt.org/z/fGaP6roe9

I see the same behavior on clang 17 as well:

https://gcc.godbolt.org/z/6jvnoxWhb
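To make the suspected NaN difference concrete, here is a minimal sketch of the two forms being compared. The function names `clamp_std` and `clamp_minmax` are hypothetical, chosen for this illustration, and the NaN results described in the comments are what the common standard library implementations produce in practice; the standard itself does not promise anything useful for NaN inputs here.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical wrapper around the standard clamp.
float clamp_std(float v, float lo, float hi) {
    // With v = NaN, every comparison against v is false, so common
    // implementations end up returning v itself, i.e. NaN.
    return std::clamp(v, lo, hi);
}

// Hypothetical wrapper around the min/max form from the thread title.
float clamp_minmax(float v, float lo, float hi) {
    // std::max(lo, v) returns its first argument unless lo < v.
    // With v = NaN that comparison is false, so lo is returned,
    // and the outer std::min(hi, lo) then yields lo as well.
    return std::min(hi, std::max(lo, v));
}
```

For ordinary inputs the two agree. They differ only when `v` is NaN, which is consistent with the observation above: on GCC, -ffast-math includes -ffinite-math-only, so the compiler may assume NaN never occurs and is free to treat the two forms as equivalent.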

◧◩
2. gumby+1n 2024-01-16 13:54:31
>>celega+im
You (celegans25) probably know this but here is a PSA that -ffast-math is really -finaccurate-math. The knowledgeable developer will know when to use it (almost never) while the naive user will have bugs.
◧◩◪
3. dahart+Vy 2024-01-16 15:05:15
>>gumby+1n
Why do you say almost never? Don’t let the name scare you; all floating point math is inaccurate. Fast math is only slightly less accurate; I think it’s typically a 1 or maybe 2 LSB difference. At least in CUDA it is, and I think many (most?) people and situations can tolerate 22 bits of mantissa compared to 23, and many (most?) people and situations aren’t paying attention to inf/NaN/exception issues at all.

I deal with a lot of floating point professionally day to day, and I use fast math all the time, since trading a relatively small loss of accuracy for higher performance is acceptable. Maybe the biggest issue I run into is the lack of denorms with CUDA fast-math, and it’s pretty rare for me to care about numbers smaller than 10^-38. Heck, I’d say I can tolerate 8 or 16 bits of mantissa most of the time, and fast-math floats are way more accurate than that. And we know a lot of neural network training these days can tolerate less than 8 bits of mantissa.
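As a rough illustration of where the 10^-38 figure comes from, the sketch below queries the single-precision limits from the C++ standard library. The helper names `smallest_normal` and `smallest_subnormal` are hypothetical; the values in the comments assume IEEE 754 binary32 `float`, which is what mainstream platforms use.

```cpp
#include <limits>

// Hypothetical helper: smallest positive normal float,
// about 1.18e-38 for IEEE 754 binary32.
float smallest_normal() {
    return std::numeric_limits<float>::min();
}

// Hypothetical helper: smallest positive subnormal (denormal) float,
// about 1.4e-45 for IEEE 754 binary32.
float smallest_subnormal() {
    return std::numeric_limits<float>::denorm_min();
}
```

Values between these two magnitudes are the denorms in question. When they are flushed to zero, anything smaller in magnitude than roughly 1.18e-38 is treated as 0.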

◧◩◪◨
4. light_+Rr1 2024-01-16 19:05:30
>>dahart+Vy
Nah, you don't deal with floats. You do machine learning which just happens to use floats. I do both numerical computing and machine learning. And oh boy are you wrong!

People who deal with actual numerical computing know that the statement "fast math is only slightly less accurate" is absurd. Fast math is unbounded in its inaccuracy! It can reorder your computations so that something that used to sum to 1 now sums to 0, it can cause catastrophic cancellation, etc.
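A small sketch of the reordering problem described here, using hypothetical helper names `sum_left` and `sum_right`. It does not need -ffast-math to run: it simply spells out the two groupings by hand. With reassociation enabled (on GCC, -ffast-math turns on -fassociative-math), the compiler is permitted to turn one grouping into the other; whether it actually does so depends on the compiler and the surrounding code.

```cpp
// Hypothetical helper: evaluates (a + b) + c exactly as written.
float sum_left(float a, float b, float c) {
    return (a + b) + c;
}

// Hypothetical helper: evaluates a + (b + c), the regrouping that
// reassociation would be allowed to produce from the form above.
float sum_right(float a, float b, float c) {
    return a + (b + c);
}
```

With `a = 1e20f`, `b = -1e20f`, `c = 1.0f`, `sum_left` gives 1 because the large terms cancel first, while `sum_right` gives 0 because `1.0f` is far below the spacing between adjacent floats near 1e20 and is absorbed before the cancellation happens.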

Please stop giving people terrible advice on a topic you're totally unfamiliar with.

◧◩◪◨⬒
5. thecha+fP1 2024-01-16 20:44:17
>>light_+Rr1
+1. I'm years away from fp-analysis, but do the transcendental expansions even converge in the presence of fast-math? No `sin()`, no `cos()`, no `exp()`, ...
◧◩◪◨⬒⬓
6. dahart+P92 2024-01-16 22:40:01
>>thecha+fP1
Well, there are library implementations of fast-math transcendentals that offer bounded error, and a million different fast sine approximation algorithms, so, yes? This is why you shouldn’t listen to FUD. The corner cases are indeed frustrating for a few people, but most never hit them, and the world doesn’t suddenly break when fast math is enabled. I am paid to do some FP analysis, btw.