In Java, a double takes 64 bits, but it stores (and computes with) numbers imprecisely. E.g. the following code:
double a = 10.125d;
double b = 7.065d;
System.out.println(a-b);
prints out 3.0599999999999996 rather than 3.06.
So, the question: what about using those 64 bits to store two 32-bit integers (the first for the whole part, the second for the decimal part)? Then calculations would be precise, right?
A naive pseudo-code implementation, with the carry between the decimal part and the whole part left unhandled:
primitive double {
    int wholePart;
    int decimalPart;

    // pseudo-code: addition operator, carry from decimalPart not handled
    public double +(double other) {
        return double(this.wholePart + other.wholePart, this.decimalPart + other.decimalPart);
    }
    // other methods in the same fashion
    public String toString() {
        return wholePart + "." + decimalPart;
    }
}
Is there a reason for Java to store double imprecisely instead of using the implementation mentioned above?
CodePudding user response:
There is one big problem with your solution: int is signed, so the decimal part could become negative, which doesn't make sense. Beyond that, you could not store the same range of values as a double, and you would lose the special values Double.NEGATIVE_INFINITY, Double.NaN and Double.POSITIVE_INFINITY. See how floating-point numbers are stored in binary, e.g. in this SO question, to understand why that is, or read IEEE 754, the standard that defines how floating-point numbers are stored in binary.
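To make the storage format a bit more concrete, here is a minimal sketch that pulls a double apart into the three IEEE 754 fields (1 sign bit, 11 exponent bits, 52 fraction bits). The class name DoubleBits and its helper methods are made up for this illustration; only Double.doubleToLongBits and BigDecimal come from the JDK.

```java
import java.math.BigDecimal;

// Hypothetical helper class, for illustration only.
final class DoubleBits {
    /** Sign bit of the IEEE 754 binary64 encoding: 0 for positive, 1 for negative. */
    static int sign(double d) {
        return (int) (Double.doubleToLongBits(d) >>> 63);
    }

    /** The 11-bit exponent field, still biased by 1023. */
    static int biasedExponent(double d) {
        return (int) ((Double.doubleToLongBits(d) >>> 52) & 0x7FF);
    }

    /** The 52-bit fraction field. */
    static long fraction(double d) {
        return Double.doubleToLongBits(d) & ((1L << 52) - 1);
    }

    public static void main(String[] args) {
        double b = 7.065d;
        System.out.println(sign(b) + " " + biasedExponent(b) + " 0x" + Long.toHexString(fraction(b)));

        // BigDecimal(double) exposes the exact value that the bits encode.
        System.out.println(new BigDecimal(10.125d)); // prints 10.125 (81/8 is exact in binary)
        System.out.println(new BigDecimal(b));       // close to, but not exactly, 7.065

        // Infinities and NaN are encoded with the all-ones exponent field (2047).
        System.out.println(biasedExponent(Double.POSITIVE_INFINITY));
        System.out.println(biasedExponent(Double.NaN));

        // Range: an int whole part tops out near 2.1e9, a double near 1.8e308.
        System.out.println(Integer.MAX_VALUE + " vs " + Double.MAX_VALUE);
    }
}
```

The point of the exponent field is that the same 64 bits can cover both tiny and huge magnitudes, which a fixed split into two 32-bit integers cannot do.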
But yes, generally speaking, if you need the precision it's a good idea to work with integer arithmetic instead of floating-point arithmetic (for the reasons why, again see the question linked above). The easiest way is to pick a different unit, namely the smallest unit you'll need.
Assume, for example, that you want to calculate prices in euros (€). If you store them as floats, you risk being inaccurate, which you really don't want when working with prices. So instead of storing € amounts, store how many cents (the smallest possible unit here) something costs, and you'll have eliminated the problem.
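A small sketch of the cents idea, assuming prices fit comfortably in a long. The class EuroCents and its format method are hypothetical names chosen for this example, and the prices are made-up values.

```java
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class EuroCents {
    /** Formats a cent amount as euros, e.g. 306 -> "3.06". */
    static String format(long cents) {
        long abs = Math.abs(cents);
        String sign = cents < 0 ? "-" : "";
        return sign + String.format(Locale.ROOT, "%d.%02d", abs / 100, abs % 100);
    }

    public static void main(String[] args) {
        long priceA = 1012; // 10.12 EUR, stored as cents
        long priceB = 706;  //  7.06 EUR, stored as cents
        System.out.println(format(priceA - priceB)); // prints 3.06

        // The same kind of sum in floating point picks up a rounding error.
        System.out.println(0.1 + 0.2);       // prints 0.30000000000000004
        System.out.println(format(10 + 20)); // prints 0.30
    }
}
```

All arithmetic happens on whole numbers of cents; the decimal point only appears when the value is formatted for display.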
For large integers there is also BigInteger, so that approach can also work for very large or very small values.
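As a rough sketch of how that could look, the example below counts everything in units of 10^-18 and keeps the count in a BigInteger. The scale of 18 digits, the class name BigUnits and the format helper are all assumptions made for this illustration.

```java
import java.math.BigInteger;

// Hypothetical helper class, for illustration only.
final class BigUnits {
    /** Assumed scale: every value is stored as a count of 10^-18 units. */
    static final int SCALE = 18;
    static final BigInteger UNITS_PER_ONE = BigInteger.TEN.pow(SCALE);

    /** Renders a unit count as a decimal string with SCALE fractional digits. */
    static String format(BigInteger units) {
        BigInteger[] parts = units.abs().divideAndRemainder(UNITS_PER_ONE);
        StringBuilder frac = new StringBuilder(parts[1].toString());
        while (frac.length() < SCALE) {
            frac.insert(0, '0'); // left-pad the fractional digits with zeros
        }
        return (units.signum() < 0 ? "-" : "") + parts[0] + "." + frac;
    }

    public static void main(String[] args) {
        // Very large: BigInteger keeps every digit, a double does not.
        BigInteger big = new BigInteger("123456789012345678901234567890");
        System.out.println(big.add(BigInteger.ONE)); // prints 123456789012345678901234567891
        System.out.println(1e29 + 1 == 1e29);        // prints true: the double cannot resolve the +1

        // Very small: a single unit is 10^-18.
        System.out.println(format(BigInteger.ONE));
        System.out.println(format(UNITS_PER_ONE.multiply(BigInteger.valueOf(3)).add(BigInteger.ONE)));
    }
}
```

The trade-off is speed and memory: BigInteger arithmetic is done in software on arbitrarily long numbers, whereas double arithmetic is done directly by the hardware.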