How does the C compiler know the end of an array?

I've read several answers to this question but still can't fully understand it. How does the compiler know where an array ends? Suppose an array of 4 ints is located in memory just before another int: can we write array[4] by mistake and get that fifth int? I suppose not, so how does the compiler know there are only 4 elements?

CodePudding user response：

If you're lucky, the compiler might spot that you're writing beyond the end of the array, and flag it as a warning, but it's not actually a compile-time error.

If you have this code:

static int a[4];
static int b;
// ...    
a[4] = 42;

You may well discover that b now has the value 42, unless the compiler decided to put it somewhere else. Strictly speaking, writing to a[4] is undefined behaviour, so nothing is guaranteed either way.

Yes, it's that easy to overrun an array in C. There are no guard rails.

In fact, this behaviour is explicitly relied upon in some places, although it's not recommended any more. You might declare a struct as follows:

struct comms_block {
    enum comms_block_type block_type;
    size_t len;
    uint8_t variable_data[1];
};

And then, when you wanted to create a comms block of type t, with variable data length len, you would use a function like this:

struct comms_block *new_comms_block(enum comms_block_type t, size_t len)
{
    /* One byte of variable_data is already counted in sizeof(*b). */
    struct comms_block *b = malloc(sizeof(*b) + len - 1);
    if (b == NULL)
        return NULL;
    b->block_type = t;
    b->len = len;
    return b;
}

The function returns a pointer to a struct comms_block with len bytes of space from variable_data[0] onwards.

In practice you can index variable_data[] with any value up to (len - 1), even though the struct definition declares it as a single-byte array. The C standard does not guarantee that this works, which is one reason the technique is no longer recommended.
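Since C99 the same layout can be expressed with a flexible array member, which is the usual replacement for the one-element trick above. The following is only a sketch of how that might look: the enumerator names COMMS_BLOCK_DATA and COMMS_BLOCK_ACK are invented for illustration (the original does not list any block types), and the NULL check and the memset are additions.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed enumerators: the original does not list any block types. */
enum comms_block_type { COMMS_BLOCK_DATA, COMMS_BLOCK_ACK };

struct comms_block {
    enum comms_block_type block_type;
    size_t len;
    uint8_t variable_data[];   /* C99 flexible array member: must be last */
};

/* Allocates a block with room for len payload bytes; returns NULL on failure. */
struct comms_block *new_comms_block(enum comms_block_type t, size_t len)
{
    /* sizeof(*b) does not count variable_data, so no "- 1" is needed. */
    struct comms_block *b = malloc(sizeof(*b) + len);
    if (b == NULL)
        return NULL;
    b->block_type = t;
    b->len = len;
    memset(b->variable_data, 0, len);   /* zero the payload */
    return b;
}
```

The compiler still does no bounds checking on variable_data; the len field is what lets your own code check an index before using it.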

CodePudding user response：

In the context where the array is defined, the bounds are specified and the compiler knows the length, so sizeof gives the size of the whole array.

In contexts where the array is passed as an argument, only its starting address is passed and the compiler does not know the length at all.

This is a notorious source of buffer-overflow bugs.

In some cases, static analysis could let a compiler warn about obvious buffer overflows, but not always.
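The difference between the two contexts can be sketched as follows. This is an illustrative example rather than anything from the original answer: ARRAY_LEN, sum_ints and sum_example are hypothetical names. The macro only works where the array itself is in scope; once the array has been passed to a function, the length has to travel as a separate argument.

```c
#include <stddef.h>

/* Element count of a real array; only valid where the array itself is in scope. */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

/* The parameter is a plain pointer, so the caller must supply the count. */
long sum_ints(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Sums a local array, using ARRAY_LEN where the length is still known. */
long sum_example(void)
{
    int numbers[4] = { 1, 2, 3, 4 };
    return sum_ints(numbers, ARRAY_LEN(numbers));
}
```

Applying ARRAY_LEN to the values parameter inside sum_ints would divide the size of a pointer by the size of an int, which is not the element count.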

CodePudding user response：

Compilers read and interpret the source code, where the array variable is declared with 4 elements. Modern compilers (and add-ons) can analyse the source code, as the programmer should, and through that evaluation determine whether "rules are being broken"...

char a[4]; // set aside 4 bytes (uninitialised)

char a[] = { 'a', 'b', 'c', 'd' }; // set aside 4 bytes (initialised)
// Above is NOT a string!

char a[] = "abc"; // 3 + 1 bytes initialised
// Above IS a string (null-terminated array of chars).

The compiler "sees" this and "knows" how big 'a[]' is.
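One way to see what the compiler has recorded is to ask for sizeof on each declaration. The sketch below renames the three arrays (raw, letters and text are hypothetical names, since a single program cannot declare a three times) and places them at file scope, where the first one is zero-initialised rather than uninitialised.

```c
#include <stddef.h>
#include <string.h>

char raw[4];                               /* 4 bytes, zero-initialised at file scope */
char letters[] = { 'a', 'b', 'c', 'd' };   /* 4 bytes, no terminator */
char text[] = "abc";                       /* 3 characters + '\0' = 4 bytes */

/* Each size is a compile-time constant derived from the declaration. */
size_t raw_size(void)     { return sizeof raw; }
size_t letters_size(void) { return sizeof letters; }
size_t text_size(void)    { return sizeof text; }

/* strlen is only safe on text, because only text is null-terminated. */
size_t text_length(void)  { return strlen(text); }
```

All three sizes are 4, but only text can be handed to string functions such as strlen.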

CodePudding user response：

char a[4] = {'w', 'x', 'y', 'z'}; // here the index of 'w' is 0 and the index of the last element is 3
// So you may ask what is stored in a[4]. Nothing that belongs to the array: a[4] is out of bounds, and reading it is undefined behaviour.
// A '\0' (null character) is only appended when the array is initialised from a string literal with room for it, e.g. char a[] = "wxyz"; which has 5 elements.

Hope you got what you asked