How to check if a float is positive denormalized/negative denormalized or not denormalized

How to check if a float is positive denormalized/negative denormalized or not denormalized.

I tried to do:

int is_denorm(float f)
{
  unsigned int x = *(int*)&f; 
  unsigned expMask = (1 << 8) - 1;
  expMask = expMask << 23;
  //now needs to check if the exp is all zero how can I do it
}

CodePudding user response：

check if a float is positive denormalized/negative denormalized or not denormalized

Note that both C and IEEE-754 use the term subnormal rather than denormal.

#include <math.h>

//  1  subnormal
// -1 -subnormal
//  0 not subnormal
int subnormalness(float x) {
  if (fpclassify(x) == FP_SUBNORMAL) {
    return signbit(x) ? -1 : 1;
  }
  return 0;
}

Avoid code like *(int*)&f and expMask << 23. It runs into aliasing problems and depends on the float encoding and on the size of unsigned.

Sometimes it is desirable to classify 0.0 together with the subnormals:

//  1 +0.0 or positive subnormal
// -1 -0.0 or negative subnormal
//  0 anything else
int subnormalzeroness(float x) {
  switch (fpclassify(x)) {
    case FP_SUBNORMAL: // fall through
    case FP_ZERO:
      return signbit(x) ? -1 : 1;
  }
  return 0;
}

Code such as the following also works, provided NaNs behave per IEEE-754 and make every < comparison false; otherwise, append && !isnan(x) to the return expression.

#include <float.h>  // FLT_MIN

// 1 for zero or subnormal, 0 otherwise; the sign is not reported
int subnormalzeroness_alt(float x) {
  return fabsf(x) < FLT_MIN;
}
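
For platforms where NaN comparisons cannot be trusted, the && !isnan(x) guard mentioned above might look like the sketch below. The function name is hypothetical, and, like the version above, it reports only whether the value is zero or subnormal, not its sign.

```c
#include <float.h>  /* FLT_MIN: smallest positive normal float */
#include <math.h>   /* fabsf, isnan */

/* Hypothetical name. Returns 1 for zeros and subnormals, 0 otherwise.
   The explicit isnan() test keeps the result 0 for NaN even if the
   platform's < comparison does not follow IEEE-754 for NaNs. */
int subnormal_or_zero_nan_safe(float x) {
  return !isnan(x) && fabsf(x) < FLT_MIN;
}
```

Testing isnan(x) first means the comparison is never evaluated for a NaN argument.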

CodePudding user response：

Instead of making assumptions about the representation of float and unsigned int, including size and endianness and encoding, you should use the fpclassify macro defined in <math.h> specifically designed for this purpose:

int is_denorm(float f) {
    return fpclassify(f) == FP_SUBNORMAL;
}

Depending on its argument, fpclassify(x) evaluates to one of the number classification macros:

FP_INFINITE
FP_NAN
FP_NORMAL
FP_SUBNORMAL
FP_ZERO

They represent the mutually exclusive kinds of floating-point values. They expand to integer constant expressions with distinct values. Additional implementation-defined floating-point classifications, with macro definitions beginning with FP_ and an uppercase letter, may also be specified by the implementation.
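
As a rough illustration of the five standard categories, a small helper can map the result of fpclassify to a readable name. The helper name and the returned strings are assumptions made for this sketch; the default branch covers any implementation-defined classification.

```c
#include <math.h>  /* fpclassify and the FP_* macros */

/* Hypothetical helper: names the classification of x. */
const char *fp_class_name(float x) {
  switch (fpclassify(x)) {
    case FP_INFINITE:  return "infinite";
    case FP_NAN:       return "nan";
    case FP_NORMAL:    return "normal";
    case FP_SUBNORMAL: return "subnormal";
    case FP_ZERO:      return "zero";
    default:           return "implementation-defined";
  }
}
```

On an implementation with IEEE 754 single precision floats and subnormal support, fp_class_name(1e-40f) should yield "subnormal", since 1e-40 is below FLT_MIN.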

The signbit macro can be used to extract the sign of a floating point value (of type float, double or long double). Note that signbit(x) evaluates to non zero for every value whose sign is negative, including -0.0, for which x < 0 would evaluate to false, and NaNs with the sign bit set.
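
The difference between signbit(x) and x < 0 can be seen with two small helpers. Both names are hypothetical and exist only for this comparison.

```c
#include <math.h>  /* signbit */

/* Hypothetical helpers contrasting two ways to test for "negative". */

/* Ordinary comparison: false for -0.0, because -0.0 == 0.0. */
int negative_by_comparison(float x) {
  return x < 0;
}

/* Sign-bit test: true for -0.0 as well as for negative values. */
int negative_by_signbit(float x) {
  return signbit(x) != 0;  /* normalize the non zero result to 1 */
}
```

For -1.0f both helpers return 1; for -0.0f only negative_by_signbit does, assuming the implementation supports signed zeros.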

Note that your approach has some problems even on architectures using IEEE 754 single precision floats and 32-bit integers with the same endianness:

To avoid aliasing issues, instead of unsigned int x = *(int *)&f; you should write uint32_t x; memcpy(&x, &f, sizeof x);
Testing the exponent bits is not sufficient to detect subnormal values, because 0.0 and -0.0 also have all exponent bits set to 0; the fraction bits must also be non zero.
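
If you still want the bit-level approach, both points can be combined as in the sketch below. It assumes float is an IEEE 754 single precision value of the same size as uint32_t; the function name is hypothetical, and fpclassify remains the portable choice.

```c
#include <stdint.h>  /* uint32_t */
#include <string.h>  /* memcpy */

/* Assumption: float is IEEE 754 binary32 (1 sign, 8 exponent, 23 fraction bits). */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical name. Returns 1 if f is subnormal, 0 otherwise. */
int is_denorm_bits(float f) {
  uint32_t x;
  memcpy(&x, &f, sizeof x);               /* read the bits without aliasing issues */
  uint32_t exponent = (x >> 23) & 0xFFu;  /* 8 exponent bits */
  uint32_t fraction = x & 0x7FFFFFu;      /* 23 fraction bits */
  return exponent == 0 && fraction != 0;  /* fraction != 0 excludes +0.0 and -0.0 */
}
```

The sign, if needed, is the top bit: x >> 31 is 1 for negative values and for -0.0.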

Also note that denormalized is not always the same as subnormal: In the IEEE 754 standard, subnormal refers to non zero numbers smaller in magnitude than any normal number, where normal numbers are those with an implicit leading 1 in the mantissa (the term denormal is no longer used in the IEEE 754 standard or in the C Standard). Other floating point standards with multiple representations may have denormalized numbers for other sets of values.