[return to "Std: Clamp generates less efficient assembly than std:min(max,std:max(min,v))"]
1. cmovq+FH 2024-01-16 15:42:56
>>x1f604+(OP)
Depending on the order of the arguments to std::min and std::max, you'll get an extra move instruction [1]:

std::min(max, std::max(min, v));

        maxsd   xmm0, xmm1
        minsd   xmm0, xmm2

std::min(std::max(v, min), max);

        maxsd   xmm1, xmm0
        minsd   xmm2, xmm1
        movapd  xmm0, xmm2

For min/max on x86, if either operand is NaN, the instruction copies the second (source) operand into the first. So the compiler can't swap the operands in the second case to make it look like the first (which would leave the result in xmm0 for the return value).

The reason for this NaN behavior is that minsd is implemented to look like `(a < b) ? a : b`: if either a or b is NaN, the condition is false and the expression evaluates to b.
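To make the operand-order point concrete, here is a rough C++ model of the two instructions. The helper names `minsd_model`, `maxsd_model` and `check_minsd_model` are made up for illustration, and the model only covers the comparison semantics described above (Intel operand order, destination first), not exception flags or signaling NaNs.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical models of the scalar SSE2 instructions, Intel operand order:
//   minsd dst, src  ->  dst = (dst < src) ? dst : src
//   maxsd dst, src  ->  dst = (dst > src) ? dst : src
// Any comparison involving NaN is false, so the source operand is returned.
inline double minsd_model(double dst, double src) { return (dst < src) ? dst : src; }
inline double maxsd_model(double dst, double src) { return (dst > src) ? dst : src; }

// Self-check: swapping the operands changes the result once NaN is involved,
// which is why the compiler cannot freely reorder them.
inline void check_minsd_model() {
    const double qnan = std::numeric_limits<double>::quiet_NaN();
    assert(minsd_model(qnan, 1.0) == 1.0);       // NaN in dst: src (1.0) is returned
    assert(std::isnan(minsd_model(1.0, qnan)));  // NaN in src: src (NaN) is returned
    assert(maxsd_model(qnan, 2.0) == 2.0);
    assert(std::isnan(maxsd_model(2.0, qnan)));
}
```

For ordinary (non-NaN, nonzero) inputs the two operand orders agree, so the difference only shows up in the NaN case.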

Possibly std::clamp has the comparisons ordered like the second case?

[1]: https://godbolt.org/z/coes8Gdhz

◧◩
2. x1f604+lX 2024-01-16 16:55:50
>>cmovq+FH
I think the libstdc++ implementation does indeed have the comparisons ordered in the way that you describe. I stepped into the std::clamp() call in gdb and got this:

    ┌─/usr/include/c++/12/bits/stl_algo.h──────────────────────────────────────────────────────────────────────────────────────
    │     3617     *  @pre `_Tp` is LessThanComparable and `(__hi < __lo)` is false.
    │     3618     */
    │     3619    template<typename _Tp>
    │     3620      constexpr const _Tp&
    │     3621      clamp(const _Tp& __val, const _Tp& __lo, const _Tp& __hi)
    │     3622      {
    │     3623        __glibcxx_assert(!(__hi < __lo));
    │  >  3624        return std::min(std::max(__val, __lo), __hi);
    │     3625      }
    │     3626
◧◩◪
3. cmovq+y01 2024-01-16 17:09:17
>>x1f604+lX
Thanks for sharing. I don't know whether the C++ standard mandates one behavior or the other; it really depends on how you want clamp to behave if the value is NaN. libstdc++'s std::clamp returns NaN, while the reverse order returns the min value.
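A minimal sketch of the two orderings, with the comparisons written out by hand instead of calling the standard functions with NaN (whether that is well-defined is a separate question). The names `max_like_std`, `min_like_std`, `clamp_value_first` and `clamp_bounds_first` are invented for this example; the comparisons are assumed to match the usual `std::max` / `std::min` definitions.

```cpp
#include <cassert>

// The same comparisons std::max and std::min perform, spelled out.
inline double max_like_std(double a, double b) { return (a < b) ? b : a; }
inline double min_like_std(double a, double b) { return (b < a) ? b : a; }

// Ordering seen in the libstdc++ listing: min(max(v, lo), hi).
inline double clamp_value_first(double v, double lo, double hi) {
    assert(!(hi < lo));  // same precondition as std::clamp
    return min_like_std(max_like_std(v, lo), hi);
}

// Reverse ordering from the first comment: min(hi, max(lo, v)).
inline double clamp_bounds_first(double v, double lo, double hi) {
    assert(!(hi < lo));
    return min_like_std(hi, max_like_std(lo, v));
}
```

With v = NaN, lo = 0.0 and hi = 1.0, every comparison against v is false, so `clamp_value_first` passes the NaN through while `clamp_bounds_first` yields lo. For in-range and out-of-range ordinary values the two agree.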
◧◩◪◨
4. cornst+Nq1 2024-01-16 19:01:36
>>cmovq+y01
From §25.8.9 Bounded value [alg.clamp]:

> 2 Preconditions: `bool(comp(proj(hi), proj(lo)))` is false. For the first form, type `T` meets the Cpp17LessThanComparable requirements (Table 26).

> 3 Returns: `lo` if `bool(comp(proj(v), proj(lo)))` is true, `hi` if `bool(comp(proj(hi), proj(v)))` is true, otherwise `v`.

> 4 [Note: If NaN is avoided, `T` can be a floating-point type. — end note]

From Table 26:

> `<` is a strict weak ordering relation (25.8)

◧◩◪◨⬒
5. rahkii+y12 2024-01-16 21:50:26
>>cornst+Nq1
Does that mean passing NaN to clamp is undefined behavior?
◧◩◪◨⬒⬓
6. cornst+H92 2024-01-16 22:39:35
>>rahkii+y12
My interpretation is that yes, passing NaN is undefined behavior. Strict weak ordering is defined in 25.8 Sorting and related operations [alg.sorting]:

> 4 The term strict refers to the requirement of an irreflexive relation (`!comp(x, x)` for all `x`), and the term weak to requirements that are not as strong as those for a total ordering, but stronger than those for a partial ordering. If we define `equiv(a, b)` as `!comp(a, b) && !comp(b, a)`, then the requirements are that `comp` and `equiv` both be transitive relations:

> 4.1 `comp(a, b) && comp(b, c)` implies `comp(a, c)`

> 4.2 `equiv(a, b) && equiv(b, c)` implies `equiv(a, c)`

NaN breaks these relations: `equiv(42.0, NaN)` and `equiv(NaN, 3.14)` are both true, which by transitivity would imply that `equiv(42.0, 3.14)` is also true. That is clearly false, so floating-point numbers with NaN included do not satisfy the strict weak ordering requirement.
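The counterexample can be checked directly. This is only a sketch: `equiv_lt` and `check_equiv_not_transitive` are illustrative names, and `comp` is assumed to be the built-in `operator<` on `double`.

```cpp
#include <cassert>
#include <limits>

// equiv(a, b) from the quoted paragraph, with comp taken to be operator<.
inline bool equiv_lt(double a, double b) { return !(a < b) && !(b < a); }

// NaN is "equivalent" to everything, yet 42.0 and 3.14 are not equivalent
// to each other, so equiv is not transitive once NaN is in the domain.
inline void check_equiv_not_transitive() {
    const double qnan = std::numeric_limits<double>::quiet_NaN();
    assert(equiv_lt(42.0, qnan));
    assert(equiv_lt(qnan, 3.14));
    assert(!equiv_lt(42.0, 3.14));
}
```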

The standard doesn't explicitly say that passing NaN is undefined behavior. But it does not define the behavior when NaN is used with `std::clamp()`, which I think by definition makes it undefined behavior.
