I've been trying to switch from Python to C for some time. While checking out a few functions, what caught my attention was the sizeof operator, which returns the size of an object in bytes. I created an array of strings and want to find the number of elements in it. I know that it can be done with sizeof(array)/sizeof(array[0]). However, I find this a bit confusing.
I expected the large array to be 2D (which is just a 1D array represented differently), with each character array inside it occupying as many bytes as the longest character array in the large array. Example below:
#include <stdio.h>
#include <string.h>
const char *words[] = {"this","that","Indian","he","she","sometimes","watch","now","browser","whatsapp","google","telegram","cp","python","cpp","vim","emacs","jupyter","space","earphones","laptop","charger","whiteboard","chalk","marker","matrix","theory","optimization","gradient","descent","numpy","sklearn","pandas","torch","array"};
const int length = sizeof(words)/sizeof(words[0]);
int main(void)
{
    printf("%s", words[1]);
    printf("%i", length);
    /* sizeof yields a size_t, which is printed with %zu */
    printf("\n%zu", sizeof(words[0]));
    printf("\n%zu %zu %s", sizeof(words[27]), strlen(words[27]), words[27]);
    return 0;
}
[OUT]
that35
8
8 12 optimization
Each of the character arrays occupies 8 bytes, including the character array "optimization". I don't understand what is going on here. The strlen function gives the expected output, since it just looks for the null character in the character array, but I'd expected the output of the sizeof operator to be 1 more than the output of strlen.
PS: I didn't find any resource that addresses this issue.
CodePudding user response:
This happens because words[27] is a pointer, so sizeof(words[27]) gives the size of a pointer. Pointers have a fixed size on a given platform, typically 8 bytes on an x86_64 CPU. words itself is an array of pointers.
each of the character arrays occupy 8 bytes, including the character array "optimization".
No. Each string in words occupies its own amount of memory (its length plus one byte for the terminating null character). The 8 bytes is the size of the pointer, which here has the same size as unsigned long int; the pointer stores the address of the string.
const int length = sizeof(words)/sizeof(words[0]);
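The above line gives 35 because words has not decayed to a pointer here: sizeof is applied to the array itself, so it returns the size of the whole array (35 pointers). words is stored in the program's data section because it's a global variable, but that is not what makes this work; the same expression works for a local array too. It stops working once the array is passed to a function, where it decays to a pointer.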
In practice, the entries of words will most likely point to string literals stored in read-only data. When words is used in this manner, it is entirely appropriate to use strlen to get the length of each string.