In chapter 3.4.2 of The C++ Programming Language, Bjarne Stroustrup says "The point of adding `int`s in a `double` would be to gracefully handle a number larger than the largest `int`." Are `double`s guaranteed to be able to hold the largest `int`s?
CodePudding user response:
It's not guaranteed.
If you assume `double` is in fact implemented as an IEEE 754 binary64 type (it should be), then it has a significand precision of 53 bits (that's the number of bits of integer precision it provides). Once you exceed 53 bits though, you'll start losing data (initially, it can only represent every other integer value, then every fourth value, then every eighth, etc., as it relies more and more on the exponent to scale the value).
On most systems, `int` is 32 bits or less, so a single `int` addition can't exceed the representational ability of the `double`. But there are systems on which `int` is 64 bits, and on those systems, even without addition getting involved, a large `int` value can overflow the representational precision of a `double`; you'll get something close to the right value, but it won't be exactly correct.
In practice, when this situation arises, you probably want to use `int64_t` or the like; `double` will be more portable (there's no guarantee a given system implements a 64-bit integer type), but it may be slower (on systems without a floating point coprocessor) and it will be inherently less precise than a true 64-bit integer type.
I suspect Bjarne Stroustrup's comment dates back to the days when virtually all systems had native integer handling of 32 bits or fewer, so:
- Not all of them provided a 64-bit integer type at all, and
- When they did provide a 64-bit integer type, it was implemented in software, with the compiler performing several 32-bit operations on paired 32-bit values to produce the equivalent of a 64-bit operation, making it much slower than a single floating point operation (assuming the system had a floating point coprocessor).
That sort of system still exists today mostly in the embedded development space, but for general purpose computers, it's pretty darn rare.
Alternatively, the computation in question may be one for which the result is likely to be huge (well beyond what even a 64-bit integer can hold) and some loss of precision is tolerated; an IEEE 754 binary64 type can technically represent values as high as 2^1023 (the gaps between representable values just get enormous at that point), and could usefully accumulate a sum of 32-bit integers (each large enough not to vanish to precision loss relative to the running total) into results whose bit counts run into the high two digits or low three digits.
CodePudding user response:
> Are `double`s guaranteed to be able to hold the largest `int`s?
No, primarily because the sizes and particular features of `double` and `int` are not guaranteed by the C standard.
The format commonly used for `double` is IEEE-754 “double precision,” also called binary64. The set of finite numbers this format represents is { M·2^e for integers M and e such that −2^53 < M < 2^53 and −1074 ≤ e ≤ 971 }. The largest set of consecutive integers in this set is the integers from −2^53 to 2^53, inclusive. 2^53 + 1 is not representable in this format.
Therefore, if `int` is 54 bits or fewer, so it has one sign bit and 53 or fewer value bits, every `int` value can be represented as a `double` in this format. If `int` is wider than 54 bits, it can represent 2^53 + 1 but this `double` format cannot.
CodePudding user response:
`int` is usually a 32-bit two’s complement integer, and `double` is usually a 64-bit double precision floating point value. A 32-bit `int` fits entirely within a `double`’s 53-bit significand, so on such systems any `int` value can be stored exactly in a `double` — but this is not guaranteed where `int` is wider.