I recently came across a third-party code snippet in our application that does not make any sense to me. What puzzled me first is that it has been in production for at least 10 years and seemed to work. It is basically a union of bit fields:
union
{
    unsigned longitude : 2; //!< status of the longitude (offset: 0)
    unsigned latitude : 2; //!< status of the latitude (offset: 2)
    unsigned xPosition : 2; //!< status of the x position relative to starting position (offset: 4)
    unsigned yPosition : 2; //!< status of the y position relative to starting position (offset: 6)
    // ... Many more 2-bit fields... Total 26 fields
    unsigned reserved : 12;
    unsigned long status[2]; //!< raw status data
} status;
I am pretty sure that this is a bug and what the author really wanted to write was:
union
{
    struct
    {
        unsigned longitude : 2; //!< status of the longitude (offset: 0)
        unsigned latitude : 2; //!< status of the latitude (offset: 2)
        unsigned xPosition : 2; //!< status of the x position relative to starting position (offset: 4)
        unsigned yPosition : 2; //!< status of the y position relative to starting position (offset: 6)
        // ... Many more 2-bit fields... Total 26 fields
        unsigned reserved : 12;
    };
    unsigned status[2]; //!< raw status data (should not be long!)
} status;
The reason it was kind of working is that only one of those 26 bit fields was actually used! But this bug raised a few questions:
- Is the compiler required to use the same first 2 bits of the same unsigned for all fields (despite what the comments say)?
- Is there any real usage for a union of bit fields? I cannot think of any situation where this would make sense.
- If that doesn't make any sense, how come that no compilers we are using (neither Clang, GCC nor MSVC) were issuing any warning?
CodePudding user response:
Is the compiler required to use the same first 2 bits of the same unsigned for all fields (despite what the comments say)?
No. C 2018 6.7.2.1 says “An implementation may allocate any addressable storage unit large enough to hold a bit-field… The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined…”
It does not say the addressable storage unit will be the same for all bit-fields of the same size. If it did, then all the union bit-field members of the same size would have to use the same bits, and certainly any reasonable C implementation would do so.
However, consider bit-fields of different sizes. It is reasonable that a compiler would allocate a one-byte storage unit for a bit-field of 2 bits and a four-byte storage unit for a bit-field of 17 bits. If it is a little-endian system and puts the bits in high-order to low-order, then the 2-bit field would be in bits 2^7 and 2^6 of byte 0, and the 17-bit field would be in all bits of bytes 3 and 2 (bits 2^31 to 2^16 of the four-byte little-endian storage unit) and bit 2^7 of byte 1 (bit 2^15 of the storage unit). So there would be no overlap between these two union members.
Is there any real usage for a union of bit fields? I cannot think of any situation where this would make sense.
Sure, I might have some field in a data structure that sometimes needs to store a 17-bit fromitz number and other times needs to store a 13-bit gizmo number. Unions were originally for storing one thing or another, not for reinterpreting bits of one type as another type.
CodePudding user response:
I was curious so I coded this up. Certainly this was a bug waiting to happen.
#include <stdio.h>
#include <string.h>
union
{
    unsigned longitude : 2; //!< status of the longitude (offset: 0)
    unsigned latitude : 2; //!< status of the latitude (offset: 2)
    unsigned xPosition : 2; //!< status of the x position relative to starting position (offset: 4)
    unsigned yPosition : 2; //!< status of the y position relative to starting position (offset: 6)
    // ... Many more 2-bit fields... Total 26 fields
    unsigned reserved : 12;
    unsigned long status[2]; //!< raw status data
} status;
int main(void) {
    // Start from all-zero storage so stale bits cannot affect the result
    memset( &status, 0x00, sizeof(status) );
    status.longitude = 0x02;
    // sizeof yields a size_t, so print it with %zu rather than %lu
    printf("Union Size = %zu\n", sizeof(status));
    printf("status.longitude = %x\n", status.longitude );
    printf("status.latitude = %x\n", status.latitude );
    return 0;
}
Setting status.longitude = 0x02; effectively sets all the other variables in the union to the same value.
Output is:
Union Size = 16
status.longitude = 2
status.latitude = 2