Multiplication of two packed signed integers in one-CodePudding

The Stockfish chess engine needs to store, for its evaluation, both an endgame score and a middlegame score.

Instead of storing them separately, it packs them into one int. The middlegame score is stored in the lower 16 bits. The endgame score is stored in the higher 16 bits, as-is if the middlegame score is positive or minus one if it is negative.

This has the advantage that operations (addition, subtraction, negation and multiplication) can be done for both numbers in parallel.

Here is the code:

/// Score enum stores a middlegame and an endgame value in a single integer (enum).
/// The least significant 16 bits are used to store the middlegame value and the
/// upper 16 bits are used to store the endgame value. We have to take care to
/// avoid left-shifting a signed int to avoid undefined behavior.
enum Score : int { SCORE_ZERO };

constexpr Score make_score(int mg, int eg) {
  return Score((int)((unsigned int)eg << 16)   mg);
}

/// Extracting the signed lower and upper 16 bits is not so trivial because
/// according to the standard a simple cast to short is implementation defined
/// and so is a right shift of a signed integer.
inline Value eg_value(Score s) {
  union { uint16_t u; int16_t s; } eg = { uint16_t(unsigned(s   0x8000) >> 16) };
  return Value(eg.s);
}

inline Value mg_value(Score s) {
  union { uint16_t u; int16_t s; } mg = { uint16_t(unsigned(s)) };
  return Value(mg.s);
}

#define ENABLE_BASE_OPERATORS_ON(T)                                \
constexpr T operator (T d1, int d2) { return T(int(d1)   d2); }    \
constexpr T operator-(T d1, int d2) { return T(int(d1) - d2); }    \
constexpr T operator-(T d) { return T(-int(d)); }                  \
inline T& operator =(T& d1, int d2) { return d1 = d1   d2; }       \
inline T& operator-=(T& d1, int d2) { return d1 = d1 - d2; }

ENABLE_BASE_OPERATORS_ON(Score)

/// Only declared but not defined. We don't want to multiply two scores due to
/// a very high risk of overflow. So user should explicitly convert to integer.
Score operator*(Score, Score) = delete;

/// Division of a Score must be handled separately for each term
inline Score operator/(Score s, int i) {
  return make_score(mg_value(s) / i, eg_value(s) / i);
}

/// Multiplication of a Score by an integer. We check for overflow in debug mode.
inline Score operator*(Score s, int i) {

  Score result = Score(int(s) * i);

  assert(eg_value(result) == (i * eg_value(s)));
  assert(mg_value(result) == (i * mg_value(s)));
  assert((i == 0) || (result / i) == s);

  return result;
}

I understand how addition, subtraction and negation work, but what I have trouble understanding is multiplication. How does multiplying the integer multiplies both the endgame and the middlegame scores together correctly?

CodePudding user response：

Multiplying two packed integers doesn't work. But multiplying by a constant works, as long as you have enough padding bits.

Imagine it in base ten, it becomes easier:

23 and 31 packed with a thousand factor: 230031

Let's multiply by 3, we would get 69 and 93

And 230031 * 3 is 690093. So, you can do both multiplication at once, as long as the number in the lowest bits doesn't overflow over the highest-bits number.

The same magic happens if you look at your number in hexadecimal. But, to me, it is easier to understand in decimal.

CodePudding user response：

The reason that multiplication works here, even for negative middlegame score, is, despite the document says:

The least significant 16 bits are used to store the middlegame value and the upper 16 bits are used to store the endgame value.

This statement isn't entirely true.

As an example, if you have mg = -10, ed = 3, if we simply take the description in the document, we should get something like:

mg =                     | 1111 1111 1111 0110
ed = 0000 0000 0000 0011 | 
------------------------- --------------------
     0000 0000 0000 0011 | 1111 1111 1111 0110

However, if we look at the logic in make_score:

Score((int)((unsigned int)eg << 16)   mg)

Note that we are actually promoting the number to int, so what we actually get is:

     1111 1111 1111 1111 | 1111 1111 1111 0110
     0000 0000 0000 0011 | 0000 0000 0000 0000
------------------------- --------------------
 (1) 0000 0000 0000 0010 | 1111 1111 1111 0110
                       ^
             this bit is different

Now go back to the math part, what we have is essentially: 3⋅2¹⁶-10.

Now say we multiply it with 37, we will get (3⋅2¹⁶⋅37)-(10⋅37), or 111⋅2¹⁶-370, or in binary:

     0000 0000 0110 1111 | 0000 0000 0000 0000
   - 0000 0000 0000 0000 | 0000 0001 0111 0010
------------------------- --------------------

Since the lowest 16 bits of 111⋅2¹⁶ are all 0s, this operation is basically having the higher 16 bits -1, and the lower 16 bits will be the bitwise negation of 370 then 1, or:

     0000 0000 0110 1110 | 1111 1110 1000 1110

In fact, as long (unsigned)(mg⋅k) is less than 2¹⁶, then doing multiplication in the lower bits will never affect the higher bits.