My understanding is that:
- Signed integer overflow in C++ is undefined behavior
- Constant expressions are not allowed to contain undefined behavior.
It seems to follow that something like the following should not compile, and indeed on my compiler it doesn't.
    template<int n> struct S { };

    template<int a, int b>
    S<a * b> f()
    {
        return S<a * b>();
    }

    int main(int, char **)
    {
        f<50000, 49999>();
        return 0;
    }
However, now I try the following instead:
    #include <numeric>

    template<int n> struct S { };

    template<int a, int b>
    S<std::lcm(a, b)> g()
    {
        return S<std::lcm(a, b)>();
    }

    int main(int, char **)
    {
        g<50000, 49999>();
        return 0;
    }
Each of g++, clang, and MSVC will happily compile this, despite the fact that
"The behavior is undefined if |m|, |n|, or the least common multiple of |m| and |n| is not representable as a value of type std::common_type_t<M, N>."
(Source: https://en.cppreference.com/w/cpp/numeric/lcm)
Is this a bug in all three compilers? Or is cppreference wrong about lcm's behavior being undefined if it can't represent the result?
Answer:
According to [expr.const]/5, "an operation that would have undefined behavior as specified in [intro] through [cpp]" is not permitted during constant evaluation, but:
"If E satisfies the constraints of a core constant expression, but evaluation of E would evaluate an operation that has undefined behavior as specified in [library] through [thread], or an invocation of the va_start macro ([cstdarg.syn]), it is unspecified whether E is a core constant expression."
We usually summarize this as "language UB must be diagnosed in a context that requires a constant expression, but library UB does not necessarily need to be diagnosed".
The reason for this rule is that an operation that causes library UB may or may not cause language UB, and it would be difficult for compilers to consistently diagnose library UB even in cases where it doesn't cause language UB. (In fact, even some forms of language UB are not consistently diagnosed by current implementations.)
Some people also refer to language UB as "hard" UB and library UB as "soft" UB, but I don't like this terminology because (in my opinion) it encourages users to think of "code for which it's unspecified whether language UB occurs" as somehow less bad than "code that unambiguously has language UB". But in both cases, the result is that the programmer cannot write a program that executes such code and expect anything to work properly.
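If you want the overflow in the question to be rejected reliably, one option is to avoid depending on how std::lcm behaves outside its precondition and to check representability yourself. The following is only a sketch, assuming C++17: checked_lcm is a hypothetical helper (it is not part of the standard library), and it handles only non-negative int arguments for simplicity. It relies on the fact that evaluating a throw-expression is not permitted in a constant expression, so reaching the throw during constant evaluation makes the program ill-formed rather than leaving the outcome unspecified.

    #include <limits>
    #include <numeric>
    #include <stdexcept>

    // Hypothetical helper, not a standard facility.
    // Computes lcm(a, b) for non-negative ints and refuses to overflow.
    constexpr int checked_lcm(int a, int b)
    {
        if (a < 0 || b < 0)
            throw std::domain_error("checked_lcm: negative argument");
        if (a == 0 || b == 0)
            return 0;

        // Divide first so the intermediate value stays as small as possible.
        const int q = a / std::gcd(a, b);

        // q * b would exceed INT_MAX: stop before the multiplication happens.
        if (q > std::numeric_limits<int>::max() / b)
            throw std::overflow_error("checked_lcm: result not representable");

        return q * b;
    }

    template<int n> struct S { };

    // Fine: 12 is representable, so this is a valid template argument.
    S<checked_lcm(4, 6)> ok;

    // With a 32-bit int, the line below should be rejected, because the
    // throw is reached while evaluating the template argument:
    // S<checked_lcm(50000, 49999)> bad;

At run time the same function throws instead of invoking undefined behavior, so the check is useful outside constant expressions as well.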