https://gcc.godbolt.org/z/fGaP6roe9
I see the same behavior on clang 17 as well
I deal with a lot of floating point professionally day to day, and I use fast math all the time, since the tradeoff of a relatively small loss of accuracy for higher performance is acceptable. Maybe the biggest issue I run into is the lack of denorms with CUDA fast-math, and it’s pretty rare for me to care about numbers smaller than 10^-38. Heck, I’d say I can tolerate 8 or 16 bits of mantissa most of the time, and fast-math floats are way more accurate than that. And we know a lot of neural network training these days can tolerate less than 8 bits of mantissa.
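For scale, that 10^-38 is roughly the smallest normal float; flush-to-zero only gives up the range between that and the smallest denormal. A quick, standalone way to see both limits (not from anyone's codebase, just an illustration):

```c++
#include <cstdio>
#include <limits>

int main() {
    // Smallest normal float (~1.18e-38) and smallest denormal (~1.4e-45):
    // flush-to-zero modes give up exactly the range between the two.
    std::printf("smallest normal float:   %g\n", std::numeric_limits<float>::min());
    std::printf("smallest denormal float: %g\n", std::numeric_limits<float>::denorm_min());
}
```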
* It links in an object file that enables denormal flushing globally, so that it affects every library linked into your application, even if that library explicitly doesn't want fast-math. This is seriously one of the most user-hostile things a compiler can do.
* The results of your program will vary depending on the exact make of your compiler and other random attributes of your compile environment, which can wreak havoc if you have code that absolutely wants bit-identical results. This doesn't matter for everybody, but there are some domains where this can be a non-starter (e.g., multiplayer game code).
* Fast-math precludes you from being able to use NaN or infinities, and often even from being able to defensively test for NaN or infinity (see the sketch after this list). Sure, there are times where this is useful, but the option you'd suggest to an uninformed programmer would more likely be a "floating-point code can't overflow" option rather than "infinity doesn't exist, and it's UB if it does exist".
* Fast-math can cause hard range guarantees to fail. Maybe you've got code where you can prove that, even with rounding error, the result will still be >= 0. With fast-math, the code might be rearranged so that the result is instead, say, -1e-10. And if you pass that to a function with a hard domain error at 0 (like sqrt), you now go from the result being 0 to the result being NaN. And see above about what happens when you get NaN.
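To make the NaN point above concrete: -ffast-math implies -ffinite-math-only on GCC and Clang, which tells the compiler that NaN never occurs, so a defensive check like the one below may simply be folded away. A minimal sketch; the function name is mine:

```c++
#include <cmath>

// Defensive guard: reject NaN before using a value.
// Under -ffast-math (which implies -ffinite-math-only), GCC and Clang are
// allowed to assume NaN never occurs, so this may be optimized down to
// "return true", silently defeating the check.
bool is_usable(double x) {
    return !std::isnan(x);
}
```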
Fast-math is a tradeoff, and if you're willing to accept the tradeoff it offers, it's a fine option to use. But most programmers don't even know what the tradeoffs are, and the failure mode can be absolutely catastrophic. It's definitely an option that is in the "you must be this knowledgeable to use" camp.
> Fast-math can cause hard range guarantees to fail. Maybe you’ve got code where you can prove that, even with rounding error, the result will still be >= 0.
Floats do this too, it’s pretty routine to bump into epsilon out-of-range issues without fast-math. Most people don’t prove things about their rounding error, and if they do, it’s easy for them to account for 3 ULPs of fast-math error compared to 1/2 ULP for the more accurate operations. Like, nobody who knows what they’re doing will call sqrt() on a number that is fresh out of a multiplier and might be anywhere near zero without testing for zero explicitly, right? I’m sure someone has done it, but I’ve never seen it, and it ranks high on the list of bad ideas even if you steer completely clear of fast-math, no?
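For what it’s worth, that defensive test is a one-liner; a minimal sketch of the usual clamp-before-sqrt idiom (the helper name is mine):

```c++
#include <algorithm>
#include <cmath>

// Rounding error (with or without fast-math) can push an analytically
// non-negative value a hair below zero, so clamp before the call that has
// a hard domain error at 0 instead of trusting the proof.
double safe_sqrt(double x) {
    return std::sqrt(std::max(x, 0.0));
}
```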
I guess I just wanted to push back a little bit on the unspecific parts of the FUD. I like your list a lot because it’s specific. Fast-math does carry some additional risks for accuracy-sensitive code, and clearly, as you and others showed, it can infect and impact your whole app, and it can sometimes lead to situations where things break that wouldn’t have happened otherwise. But I think in the grand scheme these situations are quite rare compared to how often people mess up regular floating point math. For a very wide swath of people doing casual arithmetic, fast-math is not likely to cause more problems than floats cause, but it’s fair to want to be careful and pay attention.
And yet, for audio processing, this is an option that most DAWs either enable silently or offer as a user choice, because denormals are inevitable in reverb tails, and on most Intel processors they slow things down by orders of magnitude.
My DAW uses both "denormals are zero" (DAZ) and "flush to zero" (FTZ) to try to avoid them; it also offers a "DC Bias" option where extremely small values are added to samples to keep them out of the denormal range.
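For anyone curious, on x86 both of those modes are just MXCSR bits that can be flipped per-thread (say, at the start of the audio callback) without building the whole project with fast-math. A hedged sketch; the DC-bias constant is purely illustrative, not what any particular DAW uses:

```c++
#include <xmmintrin.h>  // _MM_SET_FLUSH_ZERO_MODE
#include <pmmintrin.h>  // _MM_SET_DENORMALS_ZERO_MODE

void enable_denormal_flushing() {
    // FTZ: denormal results are flushed to zero.
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    // DAZ: denormal inputs are treated as zero.
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
}

// The "DC Bias" alternative: add an offset far too small to hear, but large
// enough to keep decaying reverb tails out of the denormal range.
// 1e-20f is illustrative, not a value taken from any specific DAW.
inline float with_dc_bias(float sample) {
    return sample + 1e-20f;
}
```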