clang/gcc cannot set global variables to an address constant plus another address constant

Time:11-25

The program below compiles without errors.

#include <stdio.h>

char addr_a[8];
char addr_b[8];

unsigned long my_addr = (unsigned long)addr_a + 8;                          // PASS
// unsigned long my_addr = (unsigned long)addr_a + (unsigned long)addr_b;   // FAIL (error: initializer element is not constant)

int main() {
        printf("%lx\n", my_addr);
        return 0;
}

Interestingly, when I set unsigned long my_addr = (unsigned long)addr_a + (unsigned long)addr_b, the compiler throws "error: initializer element is not constant."

I know globals can only be initialized with a constant expression. I also know that the types of constant expressions that can be used in an initializer for a global are specified in section 6.6p7 of the C standard:

More latitude is permitted for constant expressions in initializers. Such a constant expression shall be, or evaluate to, one of the following:

  • an arithmetic constant expression,
  • a null pointer constant,
  • an address constant, or
  • an address constant for a complete object type plus or minus an integer constant expression.

Note that an address constant plus an integer constant is allowed, but not an address constant plus another address constant.

Question:

Why does the C standard restrict the ways you can initialize global variables? What is stopping the C standard from accepting unsigned long my_addr = (unsigned long)addr_a + (unsigned long)addr_b?

CodePudding user response:

Objects with static storage duration (i.e. globals, plus locals defined as static) can only be initialized with a constant expression.

The kinds of constant expression that can be used in an initializer for such an object are specified in section 6.6p7 of the C standard:

More latitude is permitted for constant expressions in initializers. Such a constant expression shall be, or evaluate to, one of the following:

  • an arithmetic constant expression,
  • a null pointer constant,
  • an address constant, or
  • an address constant for a complete object type plus or minus an integer constant expression.

Note that an address constant plus an integer constant is allowed, but not an address constant plus another address constant.

Granted, this still isn't exactly what you have, since your address constants are cast to an integer type. So let's check 6.6p6, which defines an integer constant expression:

An integer constant expression shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, _Alignof expressions, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof or _Alignof operator.

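This paragraph doesn't allow casting an address constant to an integer type as part of an integer constant expression, but gcc and clang apparently support it as an extension, at least for the address-plus-integer case. (6.6p10 lets an implementation accept other forms of constant expressions.)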

CodePudding user response:

What is stopping the C standard from accepting unsigned long my_addr = (unsigned long)addr_a + (unsigned long)addr_b?

The underlying reason is "Because why would anyone want that?" It's not meaningful to add two absolute addresses together; the result isn't the address of anything in particular.

It's thus a sort of chicken-and-egg thing. The language doesn't support it because it's useless, but also because existing linkers and object file formats don't support such a relocation. For instance, for ELF on x86-64, see the psABI Table 4.9 for a list of supported relocations, and note there is no S + S (where S stands for the value of a symbol). And the linkers don't support it because it's useless, and because the language doesn't require it to be supported.

I guess originally, the tools probably came before the language (the earliest C compilers would have used linkers designed for assembly programs). So the original tools probably didn't support this, the language saw no need to demand that they do so, and over time, neither one ever saw a need to add it.
