zlacker

[return to "Std: Clamp generates less efficient assembly than std:min(max,std:max(min,v))"]
1. celega+im 2024-01-16 13:50:05
>>x1f604+(OP)
On gcc 13, the difference in assembly between the min(max()) version and std::clamp is eliminated when I add the -ffast-math flag. I suspect that the two implementations handle one of the arguments being NaN a bit differently.

https://gcc.godbolt.org/z/fGaP6roe9

I see the same behavior on clang 17 as well

https://gcc.godbolt.org/z/6jvnoxWhb
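
A minimal sketch of the suspected NaN difference, assuming IEEE 754 floats and a build without -ffast-math. The helpers `clamp_like`, `max_like`, `min_like`, and `minmax_like` are hypothetical stand-ins that copy the comparison order commonly used for these functions. They are not the library code itself, and as far as I know the standard only covers floating-point arguments to std::clamp when NaN is avoided.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in using the comparison order of a typical std::clamp.
inline float clamp_like(float v, float lo, float hi) {
    return (v < lo) ? lo : (hi < v) ? hi : v;
}

// Hypothetical stand-ins for the usual std::max / std::min definitions.
inline float max_like(float a, float b) { return (a < b) ? b : a; }
inline float min_like(float a, float b) { return (b < a) ? b : a; }

// The composition from the thread title: min(max, max(min, v)).
inline float minmax_like(float v, float lo, float hi) {
    return min_like(hi, max_like(lo, v));
}

// Returns true when the two formulations disagree for a NaN input.
inline bool nan_results_differ() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    const float a = clamp_like(nan, 0.0f, 1.0f);   // every comparison is false, so v (NaN) comes back
    const float b = minmax_like(nan, 0.0f, 1.0f);  // max_like keeps lo, then min_like keeps lo
    return std::isnan(a) && !std::isnan(b);
}
```

If the compiler is told NaN never occurs, it no longer has to preserve this difference, which would be consistent with the two versions compiling to the same code under -ffast-math.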

◧◩
2. gumby+1n 2024-01-16 13:54:31
>>celega+im
You (celegans25) probably know this but here is a PSA that -ffast-math is really -finaccurate-math. The knowledgeable developer will know when to use it (almost never) while the naive user will have bugs.
◧◩◪
3. dahart+Vy 2024-01-16 15:05:15
>>gumby+1n
Why do you say almost never? Don’t let the name scare you; all floating point math is inaccurate. Fast math is only slightly less accurate; I think typically it’s a 1 or maybe 2 LSB difference. At least in CUDA it is, and I think many (most?) people & situations can tolerate 22 bits of mantissa compared to 23, and many (most?) people/situations aren’t paying attention to inf/nan/exception issues at all.

I deal with a lot of floating point professionally day to day, and I use fast math all the time, since the tradeoff for higher performance and the relatively small loss of accuracy are acceptable. Maybe the biggest issue I run into is lack of denorms with CUDA fast-math, and it’s pretty rare for me to care about numbers smaller than 10^-38. Heck, I’d say I can tolerate 8 or 16 bits of mantissa most of the time, and fast-math floats are way more accurate than that. And we know a lot of neural network training these days can tolerate less than 8 bits of mantissa.
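
For readers unsure what "numbers smaller than 10^-38" refers to, here is a small sketch assuming IEEE 754 single precision. The helper names `smallest_normal`, `smallest_subnormal`, and `is_subnormal` are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Smallest positive normal float, roughly 1.18e-38.
inline float smallest_normal() { return std::numeric_limits<float>::min(); }

// Smallest positive subnormal (denormal) float, roughly 1.4e-45.
inline float smallest_subnormal() { return std::numeric_limits<float>::denorm_min(); }

// True when x is a nonzero value stored in subnormal form.
inline bool is_subnormal(float x) { return std::fpclassify(x) == FP_SUBNORMAL; }
```

With default IEEE behavior, `smallest_normal() / 2.0f` is a subnormal value. In a mode that flushes denorms to zero, the same division would be expected to yield 0 instead, which is the loss described above.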

◧◩◪◨
4. mort96+ZA 2024-01-16 15:14:44
>>dahart+Vy
The scary thing IMO is: your code might be fine with unsafe math optimisations, but maybe you're using a library which is written to do operations in a certain order to minimise numerical error, and unsafe math optimisations transform that code into something mathematically equivalent which results in many orders of magnitude more numerical error. It's probably fine most of the time, but it's kinda scary.
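
One well-known example of code that depends on operation order is Kahan (compensated) summation. The sketch below assumes IEEE 754 floats and a build without -ffast-math; `naive_sum` and `kahan_sum` are illustrative names, not taken from any particular library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Plain left-to-right accumulation of `count` copies of `term` onto `start`.
inline float naive_sum(float start, float term, std::size_t count) {
    float sum = start;
    for (std::size_t i = 0; i < count; ++i) sum += term;
    return sum;
}

// Kahan summation: c tracks the low-order bits lost by each addition.
inline float kahan_sum(float start, float term, std::size_t count) {
    float sum = start;
    float c = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        const float y = term - c;  // apply the correction carried from the previous step
        const float t = sum + y;   // low-order bits of y may be lost here
        c = (t - sum) - y;         // algebraically zero; in floats it recovers the lost bits
        sum = t;
    }
    return sum;
}
```

Adding 1e-8f to 1.0f a million times leaves `naive_sum` stuck at 1.0f, while `kahan_sum` lands close to 1.01. A compiler that is allowed to reassociate may simplify `(t - sum) - y` to zero, which would quietly turn the second function into the first.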
◧◩◪◨⬒
5. dahart+oC 2024-01-16 15:22:16
>>mort96+ZA
It shouldn’t be scary. Any library that is sensitive to order of operations will hopefully have a big fat warning on it. And it can be compiled separately with fast-math disabled. I don’t know of any such libraries off the top of my head, and it’s quite rare to find situations that result in orders of magnitude more error, though I grant you it can happen, and it can be contrived pretty easily.
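
A contrived case of the kind mentioned here, as a sketch assuming IEEE 754 single precision and no -ffast-math. The function names `sum_left` and `sum_right` are hypothetical.

```cpp
#include <cassert>

// Left-to-right grouping: the two large terms cancel before c is added.
inline float sum_left(float a, float b, float c) { return (a + b) + c; }

// Regrouped: c is absorbed by the large term b before the cancellation.
inline float sum_right(float a, float b, float c) { return a + (b + c); }
```

With a = 1e20f, b = -1e20f, and c = 1.0f, `sum_left` gives 1.0f and `sum_right` gives 0.0f. Both groupings are mathematically equal, and a compiler permitted to reassociate may pick either one.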
◧◩◪◨⬒⬓
6. mort96+KJ 2024-01-16 15:51:17
>>dahart+oC
I don't typically thoroughly read through the documentation for all the dependencies which my dependencies are using.

But you're correct that it's probably usually fine in practice.

◧◩◪◨⬒⬓⬔
7. dahart+7N 2024-01-16 16:05:41
>>mort96+KJ
That’s fair. Ideally transitive dependencies should be completely hidden from you. Hopefully the author of the library you include directly has heeded the instructions of libraries they depend on.

Hey, I grant and acknowledge that using fast-math carries a little risk of surprises; we don’t necessarily need to try to think of corner cases. I’m mostly pushing back a little because using floats at all carries almost as much risk. A lot of people seem to use floats without knowing how inaccurate floats are, and a lot of people aren’t doing precision analysis or handling the exceptional cases… and don’t really need to.

◧◩◪◨⬒⬓⬔⧯
8. ska+5d1 2024-01-16 18:09:04
>>dahart+7N
> A lot of people seem to use floats without knowing how inaccurate floats are,

Small nit, but floats aren't inaccurate, they have non uniform precision. Some float operations can be inaccurate, but that's rather path dependent...
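
To make "non uniform precision" concrete, here is a sketch assuming IEEE 754 single precision. The helper `gap_above` is a made-up name; it measures the distance from a float to the next representable float above it.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Gap between x and the next representable float above it.
inline float gap_above(float x) {
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}
```

Near 1.0f the gap is 2^-23 (the float epsilon), while at 16777216.0f (2^24) the gap is already 2.0f, so adding 1.0f there changes nothing.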

One problem with -ffast-math is that a) it sounds appealing and b) people don't understand floats, so lots of people turn it on without understanding what it does, and that can introduce subtle problems in code they didn't write.

Sometimes in computational code it makes sense e.g. to get rid of denorms, but a very small fraction of programmers understand this properly, or ever will.

I wish they had named it something scary sounding.

◧◩◪◨⬒⬓⬔⧯▣
9. dahart+iH1 2024-01-16 20:08:13
>>ska+5d1
I am talking about float operations, of course. And they’re all inaccurate, generally speaking, because they round. Fast math rounding error is not much larger than rounding error without fast math.