Behavior of back and forth shift of small types (char and short)

Time:11-08

Suppose I want to set the first (leftmost) i bits of a variable c to zero. One way to do it is to shift left by i bits and then shift right by the same amount. Here is a simple program that does this:

#include <iostream>

int main() {
    using type = unsigned int;
    type c, i;
    std::cin >> c >> i;
    c = (c << i) >> i;
    std::cout << c << "\n";
    return 0;
}

But when the type is unsigned short or unsigned char, this does not work, and c stays unchanged. On one hand, this is entirely expected, since we know that registers are at least 32 bits wide, so shifting a one- or two-byte value back and forth won't set its leftmost bits to zero. But the question is: how does such behavior comply with the standard and the definition of operator<<? From the language's point of view, why does c = (c << i) >> i; not behave the same as c <<= i; c >>= i;? Is it even defined behavior, and if so, are there other examples where semantically equivalent code behaves differently? (Or why aren't these two lines equivalent?)
I also looked at the generated assembly, and with -O2 it looks more or less like this for any type:

    sall    %cl, %esi
    shrl    %cl, %esi

But if we make i constant, then g++ masks ints with 2^(n_bits - i) - 1, but never bothers generating any instructions for shorts and chars and prints them right after reading them from cin. So it definitely knows how this works, and hence this behavior should be documented somewhere, even though I couldn't find anything.

P.S. Of course there are more reliable ways to set the required bits to zero, e.g. the one gcc uses when it knows i, but this question is more about the rules of behavior than about setting bit fields.

Answer:

how does such behavior comply with the standard and the definition of operator<<?

The behaviour that you observe conforms to the standard.

Is it even defined behavior

Yes, it is defined (assuming i isn't so large that the shift overflows; you won't be able to set all bits to zero using this method).

why aren't these two lines equivalent?

Because there are no arithmetic operations for integer types of lower rank than int in C++, and all arithmetic operands of smaller types are implicitly converted to signed int. Such an implicit conversion is called a promotion.

The behaviours of signed right shift and unsigned right shift are different. Signed right shift extends the leftmost bit such that the sign remains the same (before C++20 this was implementation-defined for negative values), while unsigned right shift pads the leftmost bits with zero.

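The second version behaves differently because the intermediate result has the smaller unsigned type, while the intermediate result in the first version is the promoted signed int (on systems where short and char are smaller than int). In the first version, the bits shifted beyond the width of the small type are still present in the int when the right shift brings them back; in the second version, they are discarded when the result of c <<= i is converted back to the small type.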
