I am looking to check if a double value can be represented as an int (or the same for any pair of floating-point and integer types). This is a simple way to do it:
double x = ...;
int i = x; // potentially undefined behaviour
if ((double) i != x) {
// not representable
}
However, it invokes undefined behaviour on the marked line whenever x is out of range for int, and triggers UBSan (which some will complain about).
Questions:
- Is this method considered acceptable in general?
- Is there a reasonably simple way to do it without invoking undefined behaviour?
Clarifications, as requested:
The situation I am facing right now involves conversion from double to various integer types (int, long, long long) in C. However, I have encountered similar situations before, thus I am interested in answers both for float -> integer and integer -> float conversions.
Examples of how the conversion may fail:
- Float -> integer conversion may fail if the value is not a whole number, e.g. 3.5.
- The source value may be out of the range of the target type (larger or smaller than the max and min representable values). For example 1.23e100.
- The source value may be an infinity (+Inf or -Inf) or NaN, NaN being tricky as every comparison with it except != returns false.
- Integer -> float conversion may fail when the float type does not have enough precision. For example, a typical double stores 52 mantissa bits (53 bits of precision) compared to 63 digits in a 64-bit integer type. For example, on a typical 64-bit system, (long) (double) ((1L << 53) + 1L) does not give back the original value.
- I do understand that 1L << 53 (as opposed to (1L << 53) + 1) is technically exactly representable as a double, and that the code I proposed would accept this conversion, even though it probably shouldn't be allowed.
- Anything I didn't think of?
CodePudding user response:
Create range limits exactly as FP types
The "trick" is to form the limits without losing precision.
Let us consider float to int.
Conversion of float to int is valid (for example with 32-bit 2's complement int) for -2,147,483,648.9999... to 2,147,483,647.9999... or nearly INT_MIN - 1 to INT_MAX + 1.
We can take advantage that integer_MAX is always a power-of-2 - 1 and integer_MIN is -(power-of-2) (for common 2's complement).
Avoid the limit of FP_INT_MIN_minus_1 as it may/may not be exactly encodable as a FP.
// Form FP limits of "INT_MAX plus 1" and "INT_MIN"
#define FLOAT_INT_MAX_P1 ((INT_MAX/2 + 1)*2.0f)
#define FLOAT_INT_MIN ((float) INT_MIN)
if (f < FLOAT_INT_MAX_P1 && f - FLOAT_INT_MIN > -1.0f) {
  // Within range.
  // Use modff() to detect a fraction if desired.
}
More pedantic code would use !isnan(f) and consider non-2's complement encoding.
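Putting the pieces of this answer together, a self-contained version might look like the sketch below. The function name float_to_int_exact and its out-parameter are hypothetical, and the code assumes a 2's complement int; it combines the range test, the explicit isnan() check, and modff() to reject fractions.

```c
#include <limits.h>
#include <math.h>
#include <stdbool.h>

/* Exact float limits: INT_MAX + 1 is a power of two, INT_MIN is -(power of two).
   Assumption: int uses 2's complement. */
#define FLOAT_INT_MAX_P1 ((INT_MAX / 2 + 1) * 2.0f)
#define FLOAT_INT_MIN    ((float)INT_MIN)

/* Hypothetical helper: stores f in *out and returns true only when f is a
   whole number inside the range of int. *out is untouched on failure. */
bool float_to_int_exact(float f, int *out)
{
    float whole;

    if (isnan(f))
        return false;                 /* NaN is never representable */
    if (!(f < FLOAT_INT_MAX_P1 && f - FLOAT_INT_MIN > -1.0f))
        return false;                 /* out of range, including infinities */
    if (modff(f, &whole) != 0.0f)
        return false;                 /* has a fractional part */

    *out = (int)f;                    /* in range, so the conversion is defined */
    return true;
}
```

The same pattern should carry over to double and to wider integer types by swapping the limits and using modf() instead of modff().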
CodePudding user response:
Using known limits and floating-point number validity. Check what's inside the limits.h header.
You can write something like this:
#include <limits.h>
#include <math.h>
// Of course, the constants used are specific to the "int" type... There are others for other types.
// isfinite() rejects NaN and infinities; isnormal() would also reject 0.0.
if (isfinite(x) && (x >= INT_MIN) && (x <= INT_MAX) && (round(x) == x))
    // Safe assignment from double to int.
    i = (int)x;
else
    // Handle error/overflow here.
    ERROR(.....);
Code relies on lazy boolean evaluation, obviously.
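One caveat when moving this pattern to wider types: on a typical system LLONG_MAX is not exactly representable as a double, so in x <= LLONG_MAX the limit is rounded up to 2^63 and an out-of-range value slips through. A minimal sketch for long long that sidesteps this with a strict "max plus one" bound follows; the function name double_to_llong_exact is hypothetical, and the code assumes a 2's complement long long and a binary double.

```c
#include <limits.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: double -> long long without undefined behaviour.
   Assumptions: 2's complement long long, binary floating-point double. */
bool double_to_llong_exact(double x, long long *out)
{
    /* LLONG_MAX + 1 is a power of two, so this product is exact. */
    const double max_p1 = (double)(LLONG_MAX / 2 + 1) * 2.0;
    /* LLONG_MIN is -(power of two), so this conversion is exact too. */
    const double min = (double)LLONG_MIN;
    long long t;

    if (!isfinite(x))                  /* rejects NaN and both infinities */
        return false;
    if (!(x >= min && x < max_p1))     /* strict upper bound */
        return false;

    t = (long long)x;                  /* defined: x is known to be in range */
    if ((double)t != x)                /* truncation changed the value: fraction */
        return false;

    *out = t;
    return true;
}
```

Once the range has been checked, the round-trip comparison from the question becomes safe, so no call to round() or modf() is needed here.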