Home > Software engineering >  Getting unexpected results when shifting a 32-bit integer more than 32 times
Getting unexpected results when shifting a 32-bit integer more than 32 times

Time:04-16

I would like clarification on why some bit shifting is doing something I wouldn't expect.

I developed a function, setBitsVersion1 that sets bits to 1 in an array of 32 bit integers using the regular int primitive type. So this means that bit positions [0, 31] are for the first integer in the array, bit positions [32, 63] are for the second integer in the array etc. For example, position 10 is position 10 at the first integer, but position 35 is really position 3 in the second integer. So, if bitPosition were 35, then line 4 would become a[1] = a[1] | one << 3 thus setting the bit at position 3. This function behaves as expected, and you can see in the output when it prints the integers, the integer that prints is correct since the bit for the binary 2^3 position was set.

However, the next part is where I'm confused. I wrote a second implementation, setBitsVersion2, that gets rid of line line 3 so the original bitPosition is never modified to be the correct bit position at a particular index. So, using 35 again the line would become a[1] = a[1] | one << 35. It's my understanding that left shifting 32 bits or more is undefined behavior. I'm not exactly sure what I would expect to print, but I definitely wouldn't of expected it to set the bit at position 3 because 1 << 35 is not 1 << 3 and shifting by 35 times is undefined behavior. Maybe I would've expected the integer to be zero, since in the past when I've shifted integers 32 times or more, it just ends up being 0. Regardless, I'm just confused as to why doing 1 << 35 produces the same results 1 << 3 since I would've expected it not to work and would like an explanation.

And in case the kind of machine/compiler I'm using is relevant, then I'm on a Windows computer using Microsoft Visual Studio 2019. Though, I also tested it out on Linux using gcc and got the same results.

CODE

#include <stdio.h>

void setBitVersion1(int* a, int bitPosition);
void setBitVersion2(int* a, int bitPosition);
void printArray(int* a, int size);

int main(int argc, char* argv[]) {
    printf("Correct implementation - setBitsVersion1 - results are correct and what is expected\n");
    int a1[5] = { 0 };
    int a1Size = sizeof(a1) / sizeof(*a1);
    setBitVersion1(a1, 35);
    setBitVersion2(a1, 67);
    printArray(a1, a1Size);
    printf("\n\n");

    printf("Incorrect version - setBitsVersion2 - still produces the same results?\n");
    int a2[5] = { 0 };
    int a2Size = sizeof(a2) / sizeof(*a2);
    setBitVersion1(a2, 35);
    setBitVersion2(a2, 67);
    printArray(a2, a2Size);

    return 0;
}

void setBitVersion1(int* a, int bitPosition) {
    int one = 1;                                // line 1
    int index = bitPosition / 32;               // line 2
    bitPosition = bitPosition % 32;             // line 3
    a[index] = a[index] | one << bitPosition;   // line 4
}

void setBitVersion2(int* a, int bitPosition) {
    int one = 1;
    int index = bitPosition / 32;
    a[index] = a[index] | one << bitPosition;
}

void printArray(int* a, int size) {
    for (int i = 0; i < size; i  )
        printf("%d\n", a[i]);
}

OUTPUT

Correct implementation - setBitsVersion1 - results are correct and what is expected
0
8
8
0
0


Incorrect version - setBitsVersion2 - still produces the same results?????
0
8
8
0
0

CodePudding user response:

more is undefined behavior. I'm not exactly sure what I would expect to print, but I definitely wouldn't of expected it to set the bit at position 3 because 1 << 35 is not 1 << 3 and shifting by 35 times is undefined behavior.

This is exactly how you should approach undefined behavior. Nice.

why doing 1 << 35 produces the same results 1 << 3 since I would've expected it not to work

Undefined behavior means that you have no expectations, not that it "will not work". It may "work" in an unpredictable manner, it may crash, it may spawn nasal demons. No one knows.

You can inspect your specific compiler assembly output for the specific options you have used, and you can discover, for example using godbolt, that your compiler on x86_64 happens to use sal edx, cl assembly instruction for left shifting. Remember, that there is no guarantee of any kind, and in case of undefined behavior, anything can happen. As you noted - you can't "expect".

From https://c9x.me/x86/html/file_module_x86_id_285.html I can read:

The count is masked to 5 bits, which limits the count range to 0 to 31

So even if a value greater than 31 is stored inside CL register, it is marked to 5 bits anyway on this specific architecture.

Note, that the compiler, given for example different compiler options, undef different optimization levels in particular, or a different version of the compiler might generate a different instruction to do the shifting.


You should use uint32_t.

You are using an int type to represent any kind of bit values pattern. Your setBitVersion1 has undefined behavior, the standard says https://port70.net/~nsz/c/c11/n1570.html#6.5.7p4 :

The result of E1 << E2 is .... If E1 has a signed type and nonnegative value, and E1 x 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

one << bitPosition when bitPosition = 31 results in... 2^31, which is greater than an int can represent INT_MAX = 2^31 - 1. This is undefined behavior - your code is actually safe for bitPosition lower or equal 30.

int and unsiggned int types have at least 16 bits. They have 16 bits on some compilers for small microcontrollers.

Use intX_t types to represent specific count of types. Use unsigned types with << >>.

CodePudding user response:

1 << 35 can produce 1 << 3 because the machine instruction for shifting a 32-bit value has only five bits in the field for the shift amount. When 35 (binary 100011) is used, the processor sees only five bits (00011), which is 3, so it shifts by 3 bits. Of course, other behaviors are possible; the compiler might calculate the shift itself using more complicated software while compiling or might use a different instruction with different behavior.

It's my understanding that left shifting 32 bits or more is undefined behavior. I'm not exactly sure what I would expect to print, but I definitely wouldn't [have] expected it to set the bit at position 3 because 1 << 35 is not 1 << 3 and shifting by 35 times is undefined behavior.

Of course you “wouldn't have expected” it to set the bit at position 3 or do anything else in particular, because the C standard does not give you any expectation. To have an expectation, you must have some grounds for expecting, such as statements in the C standard about how it should behave or familiarity with how compilers are written or knowledge of processor instructions. Be on guard for expecting certain things to occur—or even for expecting certain things not to occur—when you do not have grounds for it.

  • Related