In the code char *str = "hello";, I understand that "hello" places the word hello somewhere in memory and then puts the address of the start of that memory into the variable str.
But when I use the code char str[10] = "hello";, I understood that the characters of the word hello are stored in the elements of the array.
If so, in the first case "hello" returns the address of the memory, and in the second case "hello" returns the characters h e l l o \n.
I want to know why they are different and, if I'm wrong, what the double quotes actually return.
CodePudding user response:
C is a bit quirky. You have two distinct use cases here. But let's first start with what "hello" is.
Your "hello" in the program source code is a character sequence. When the compiler compiles this source code, it appends a zero byte to the sequence so that standard library functions like strlen() can work on it. The resulting zero-terminated sequence is then used by the compiler to "initialize an array of static storage duration and length just sufficient to contain the sequence" (n1570 ISO C draft, 6.4.5/6). That length is 6: the 5 characters h, e, l, l and o, plus the appended zero byte.
"Static storage duration" means that the array exists the entire time the program is running (as opposed to objects with automatic local storage duration, e.g. local variables, and those with dynamic storage duration, which are created via malloc()
or calloc()).
You can store the address of that array, as in char *str = "hello";. This address points to valid memory for the entire lifetime of the program.
The second use case is a special syntax for initializing character arrays. It is just syntactic sugar for this common use case, and a deviation from the fact that you cannot normally initialize arrays with arrays.1
This time you don't define a pointer, you define a proper array of 10 chars. You then use the string literal to initialize it. You can always use the generic method to initialize a character array by listing the individual array elements, separated by commas, in curly braces (by the way, this generic method also works for the other kind of aggregate type, namely structs):
char str[10] = { 'h', 'e', 'l', 'l', 'o', '\0' };
This is entirely equivalent to
char str[10] = "hello";
Now your array has more elements (10) than the number of characters in the initializing array produced from the string literal (6); the standard stipulates that "subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration". Objects with static storage duration (global and static variables) are initialized with zero, which means that the character array str has four more zero bytes after the terminator.
It is immediately obvious why Dennis Ritchie added the somewhat anti-paradigmatic initialization of character arrays via a string literal, probably after the second time he had to do it with the generic array initialization syntax. Designing your own language has its benefits.
1 For example, static char src[] = "123"; char dest[] = src; doesn't work. You have to use strcpy().
CodePudding user response:
The initialization:
char * str = "hello";
in most C implementations makes sure that the string hello is placed in a constant data section of the executable's memory. Exactly six bytes are written, the last one being the string terminator '\0'.
The char pointer str contains the address of the first character 'h', so that any code accessing the string knows to read the following bytes until it finds the terminator character.
The other initialization
char str[10] = "hello"; // <-- string must be enclosed in double quotes
is similar in that the characters are written to consecutive memory locations (including the string terminator). Here, however, str is the array itself rather than a pointer; in most expressions it converts to a pointer to its first character.
But:
- Even though only six bytes are explicitly initialized, ten bytes are allocated because that is the size of the array. In this case, the four trailing bytes will contain zeroes.
- The data is not constant and can be changed. In the previous example, modifying the string is undefined behavior, because such an initialization, in most C implementations, instructs the compiler to place the literal in a constant data section.
CodePudding user response:
You seem to be mixing up some things:
char str[10] = "hello';
This does not even compile: when you start with a double-quote, you should end with one:
char str[10] = "hello";
In memory, this has the following effect:
str[0] : h
str[1] : e
str[2] : l
str[3] : l
str[4] : o
str[5] : 0 (the zero character constant)
str[6] : 0
str[7] : 0
str[8] : 0
str[9] : 0
(The last four elements are not covered by the initializer, so they are set to zero as well)
As a result, the code will not yield hello\n (with an end-of-line character), just hello\0 (with the zero character).
The double quotes just mark the beginning and the end of a string literal; they do not return anything by themselves.