I was curious about these things:
- Does the C language always store the floating-point numbers in the normalized form? (That is, is the normalization always applied?)
- Does this also hold true for the results obtained after some arithmetic (addition, multiplication)?
- Is it dependent on the language or the hardware - FPU?
It would be really helpful if you can cite any sources. I've looked at the IEEE-754 document, but was not able to find any specific statements regarding implementation.
CodePudding user response:
Does the C language always store the floating-point numbers in the normalized form?
"It depends." As we'll see, it's more the hardware than the C language that determines this.
If the implementation uses something other than IEEE-754, there's not much we can say.
If the implementation does use IEEE-754, then all numbers are always stored normalized except the ones that aren't, namely the subnormals (and zero, which has its own all-zero encoding).
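For concreteness, here is a minimal sketch (assuming an IEEE-754 double and a C99 <math.h>; DBL_MIN / 4.0 is chosen only because it falls below the smallest normalized double) that classifies an ordinarily stored value and a subnormal one:

```c
/* Classify a normally stored value and a deliberately tiny one.
   Assumes IEEE-754 double and C99 <math.h>. */
#include <stdio.h>
#include <float.h>
#include <math.h>

int main(void)
{
    double normal = 1.5;           /* stored in normalized form */
    double tiny   = DBL_MIN / 4.0; /* below the smallest normalized double */

    printf("1.5       -> %s\n",
           fpclassify(normal) == FP_NORMAL    ? "FP_NORMAL"    : "other");
    printf("DBL_MIN/4 -> %s\n",
           fpclassify(tiny)   == FP_SUBNORMAL ? "FP_SUBNORMAL" : "other");
    return 0;
}
```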
Does this also hold true for the results obtained after some arithmetic (addition, multiplication)?
Yes. (More on this below.)
Is it dependent on the language or the hardware - FPU?
It is typically dependent on the hardware. A C program is typically compiled straight to the target machine's floating-point instructions, without any language- or compiler-imposed extra processing. (This is in contrast to, for example, Java, which does have a language-imposed floating-point definition, implemented in part by the JVM.)
The C standard does have an optional section, "Annex F", which specifies detailed floating-point behavior conformant with IEEE-754.
Now, if the C implementation adopts Annex F and is conformant with IEEE-754 (typically because the underlying hardware is, too), the answers to your first two questions become even easier. In IEEE-754 arithmetic, there are no ambiguities of representation. Every number that can be represented in normalized form has exactly one normalized representation. Every number that cannot be represented in normalized form, but that can be represented as a subnormal, has exactly one subnormal representation. These constraints apply to every IEEE-754 floating-point number, including (naturally enough) the results of arithmetic operations.
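If you want to see that uniqueness in the stored bits, here is a sketch under the usual assumptions (a 64-bit double laid out as 1 sign bit, 11 exponent bits, and 52 fraction bits, which is what Annex F's binary64 gives you). Note that DBL_MIN / 2.0 is itself the result of an arithmetic operation, and it comes back with the subnormal encoding: exponent field zero, no implicit leading 1.

```c
/* Inspect the stored encoding of a few doubles.
   Assumes a 64-bit IEEE-754 double with the usual 1/11/52 layout. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <float.h>

static void show(const char *label, double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);                          /* view the object representation */
    unsigned exponent = (unsigned)((u >> 52) & 0x7FF); /* biased exponent field */
    uint64_t fraction = u & 0xFFFFFFFFFFFFFULL;        /* 52-bit fraction field */
    printf("%-10s exponent=%4u  fraction=%013llx  (%s)\n",
           label, exponent, (unsigned long long)fraction,
           exponent != 0 ? "normalized, implicit leading 1"
                         : "exponent field 0: zero or subnormal");
}

int main(void)
{
    show("1.0",       1.0);
    show("DBL_MIN",   DBL_MIN);        /* smallest normalized double */
    show("DBL_MIN/2", DBL_MIN / 2.0);  /* arithmetic result stored as a subnormal */
    show("0.0",       0.0);
    return 0;
}
```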
The last question, I suppose, is "How many C implementations adopt Annex F?", or, stated another way, "How many processors comply with IEEE-754?" For the CPUs in general-purpose computers (mainframes and personal computers), as far as I know the answer is "all of them". GPUs, on the other hand, are deliberately not quite compatible with IEEE-754 (because they can be more efficient that way). Microprocessors for "embedded" work I'm not so sure about. (Often they don't have viable floating-point at all.)
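One way to ask a particular implementation where it stands is the standard feature-test macro from Annex F: an implementation that defines __STDC_IEC_559__ is claiming conformance to IEC 60559, and hence to Annex F's semantics. A trivial probe:

```c
/* Report whether this implementation claims Annex F / IEC 60559 conformance. */
#include <stdio.h>

int main(void)
{
#ifdef __STDC_IEC_559__
    puts("__STDC_IEC_559__ is defined: Annex F (IEC 60559 / IEEE-754) semantics claimed");
#else
    puts("__STDC_IEC_559__ is not defined: no Annex F claim from this implementation");
#endif
    return 0;
}
```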
CodePudding user response:
It would be really helpful if you can cite any sources.
C 2018 5.2.4.2.2 3 defines a floating-point representation for a number x as x = s · b^e · Σ (k = 1 to p) f_k · b^(−k), where s is the sign (±1), b is the base, e is an integer exponent in a range from e_min to e_max, p is the precision (number of base-b digits), and f_k are the base-b digits of the significand.
Then paragraph 4 says:
In addition to normalized floating-point numbers (f_1 > 0 if x ≠ 0), floating types may be able to contain other kinds of floating-point numbers, such as subnormal floating-point numbers (x ≠ 0, e = e_min, f_1 = 0) and unnormalized floating-point numbers (x ≠ 0, e > e_min, f_1 = 0), and values that are not floating-point numbers, such as infinities and NaNs…
That is it; the C standard is otherwise silent about normalization of floating-point numbers, except that Annex F offers an optional binding to IEC 60559 (effectively IEEE 754). So this answers the questions:
Does the C language always store the floating-point numbers in the normalized form? (That is, is the normalization always applied?)
No.
Does this also hold true for the results obtained after some arithmetic (addition, multiplication)?
No.
Is it dependent on the language or the hardware - FPU?
It is up to each C implementation. C implementations may be influenced by hardware or may adopt software implementations of floating-point arithmetic.
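As a practical footnote: whatever the implementation chose for the model parameters b, p, e_min, and e_max in 5.2.4.2.2, it reports them through <float.h>, so you can query what you actually got. A sketch for double (DBL_HAS_SUBNORM is C11 and later; drop that line for C99):

```c
/* Print the 5.2.4.2.2 model parameters for double, as reported by <float.h>. */
#include <stdio.h>
#include <float.h>

int main(void)
{
    printf("b (radix)         = %d\n", FLT_RADIX);
    printf("p (precision)     = %d\n", DBL_MANT_DIG);
    printf("e_min             = %d\n", DBL_MIN_EXP);
    printf("e_max             = %d\n", DBL_MAX_EXP);
    printf("subnormal support = %d\n", DBL_HAS_SUBNORM); /* 1 yes, 0 no, -1 indeterminable */
    return 0;
}
```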