https://gcc.godbolt.org/z/fGaP6roe9
I see the same behavior on clang 17 as well
I deal with a lot of floating point professionally day to day, and I use fast-math all the time, since trading a relatively small loss of accuracy for higher performance is acceptable. Maybe the biggest issue I run into is the lack of denormals with CUDA fast-math, and it’s pretty rare for me to care about numbers smaller than 10^-38. Heck, I’d say I can tolerate 8 or 16 bits of mantissa most of the time, and fast-math floats are way more accurate than that. And we know a lot of neural network training these days can tolerate less than 8 bits of mantissa.
Can you elaborate? How can fast-math sneak into a library that disabled fast-math at compile time?
> fast-math enables some dynamic initialization of the library that changes the floating point environment in some ways.
I wasn’t aware of this. I would love to see some documentation discussing exactly what happens; can you send a link?
Turn on fast-math and it flips the FTZ/DAZ bits for the entire application, even if you turned it on for just a shared library! With GCC, -ffast-math links in crtfastmath.o, whose startup code sets FTZ/DAZ in the x86 MXCSR register when the object is loaded, so the change affects every thread of the whole process, not just the code compiled with the flag.