I have problems understanding how char* works.
In the example below, struses is called by main().
I created buf because I want a modifiable copy of the const string s1, then I just call sortString().
This version makes sense to me as I understand that char[] can be modified:
#include "common.h"
#include <stdbool.h>
void sortString(char string[50]);
bool struses(const char *s1, const char *s2)
{
char buf[50];
strcpy(buf, s1); // <===== input = "perpetuity";
sortString(buf);
printf("%s\n", buf); // prints "eeipprttuy"
return true;
}
void sortString(char string[50])
{
char temp;
int n = strlen(string);
for (int i = 0; i < n - 1; i++)
{
for (int j = i + 1; j < n; j++)
{
if (string[i] > string[j])
{
temp = string[i];
string[i] = string[j];
string[j] = temp;
}
}
}
}
However, in this version I deliberately changed the type to char*, which is supposed to be read-only. Why do I still get the same result?
#include "common.h"
#include <stdbool.h>
void sortString(char *string);
bool struses(const char *s1, const char *s2)
{
char buf[50];
strcpy(buf, s1);
sortString(buf);
printf("%s\n", buf);
return true;
}
void sortString(char *string) // <==== changed the type
{
char temp;
int n = strlen(string);
for (int i = 0; i < n - 1; i++)
{
for (int j = i + 1; j < n; j++)
{
if (string[i] > string[j])
{
temp = string[i];
string[i] = string[j];
string[j] = temp;
}
}
}
}
This is why I think char * is read-only. I get a bus error after trying to modify read[0]:
char * read = "Hello";
read[0]='B';// <=== Bus error
printf("%s\n", read);
Update: it turns out the program does not actually produce a bus error. I guess that, as the behaviour is undefined, the result is also not predictable?
CodePudding user response:
The compiler adjusts a parameter declared with an array type, as in this function declaration
void sortString(char string[50]);
to a pointer to the element type
void sortString(char *string);
So, for example, these function declarations are all equivalent and declare the same function
void sortString(char string[100]);
void sortString(char string[50]);
void sortString(char string[]);
void sortString(char *string);
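To make the adjustment concrete, here is a minimal sketch. The names capitalize and first_after_capitalize are made up for illustration and are not part of the question's code. The function is declared with the array spelling and defined with the pointer spelling, and it happily accepts a 4-element array even though the declaration says 50:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Declared with an array type, but the parameter is adjusted to char *.
   The 50 is not enforced; any char array (or pointer) may be passed. */
void capitalize(char string[50]);

/* The same function, defined with the pointer spelling. */
void capitalize(char *string)
{
    assert(string != NULL);
    if (string[0] != '\0')
        string[0] = (char)toupper((unsigned char)string[0]);
}

/* Hypothetical helper: passes a 4-element array to the function above. */
char first_after_capitalize(void)
{
    char small[4] = "abc";   /* modifiable array, not a string literal */
    capitalize(small);       /* the array decays to a pointer to small[0] */
    return small[0];
}
```

Because both spellings name the same type, the compiler accepts the two declarations as one function, and the write inside capitalize changes the caller's array.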
This function
void sortString(char *string)
is passed the character array buf
that stores a copy of the passed array (or of the string literal that was passed through a pointer to it)
char buf[50];
strcpy(buf, s1);
sortString(buf);
So there is no problem. s1 can be a pointer to a string literal, but the content of the string literal is copied into the character array buf, and it is buf that is changed.
As for this code snippet
char * read = "Hello";
read[0]='B';
printf("%s\n", read); // <=== still prints "Hello"
it has undefined behavior, because you may not change a string literal.
From the C Standard (6.4.5 String literals)
7 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
Note that in C++, unlike in C, string literals have types of constant character arrays. It is advisable in C as well to declare pointers to string literals with the qualifier const, to avoid undefined behavior, as for example
const char * read = "Hello";
By the way, the function sortString performs redundant swaps of elements in the passed string. It is better to declare and define it the following way
// Selection sort
char * sortString( char *s )
{
for ( size_t i = 0, n = strlen( s ); i != n; i++ )
{
size_t min_i = i;
for ( size_t j = i + 1; j != n; j++ )
{
if ( s[j] < s[min_i] )
{
min_i = j;
}
}
if ( i != min_i )
{
char c = s[i];
s[i] = s[min_i];
s[min_i] = c;
}
}
return s;
}
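As a usage sketch for a selection sort of this shape, the following copies the input into a caller-supplied buffer before sorting, so a string literal is never written to. The names sort_chars and sorted_copy are hypothetical, and the bounds check is an addition that the question's strcpy call does not have:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Selection sort of the characters of s, in place; returns s. */
char *sort_chars(char *s)
{
    assert(s != NULL);
    for (size_t i = 0, n = strlen(s); i != n; i++) {
        size_t min_i = i;
        for (size_t j = i + 1; j != n; j++) {
            if (s[j] < s[min_i])
                min_i = j;
        }
        if (i != min_i) {           /* at most one swap per position */
            char c = s[i];
            s[i] = s[min_i];
            s[min_i] = c;
        }
    }
    return s;
}

/* Hypothetical helper: copy src into dst (capacity cap) and sort the copy.
   Returns NULL if src does not fit, so dst is never overflowed. */
char *sorted_copy(char *dst, size_t cap, const char *src)
{
    size_t len = strlen(src);
    if (len >= cap)
        return NULL;
    memcpy(dst, src, len + 1);      /* includes the terminating '\0' */
    return sort_chars(dst);
}
```

Called with a 50-byte buffer and "perpetuity", sorted_copy leaves "eeipprttuy" in the buffer, while the literal itself stays untouched.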
CodePudding user response:
char * does not mean read-only. char * simply means pointer to char.
You have likely been taught that string literals, such as "Hello", may not be modified. That is not quite true; a correct statement is that the C standard does not define what happens when you attempt to modify a string literal, and C implementations commonly place string literals in read-only memory.
We can define objects with the const qualifier to say we intend not to modify them and to allow the compiler to place them in read-only memory (although it is not obligated to). If we were defining the C language from scratch, we would specify that string literals are const-qualified, and the pointers that come from string literals would be const char *.
However, when C was first developed, there was no const, and string literals produced pointers that were just char *. The const qualifier came later, and it is too late to change string literals to be const-qualified because of all the old code using char *.
Because of this, it is possible that a char * points to characters in a string literal that should not be modified (because the behavior is not defined). But char * in general does not mean read-only.
CodePudding user response:
Your premise that the area pointed to by a char* isn't modifiable is false. This is perfectly fine:
char s[] = "abc"; // Same as: char s[4] = { 'a', 'b', 'c', 0 };
char *p = s; // Same as: char *p = &(s[0]);
*p = 'A';
printf("%s\n", p); // Abc
The reason you had a fault is because you tried to modify the string created by a string literal. This is undefined behaviour:
char *p = "abc";
*p = 'A'; // Undefined behaviour
printf("%s\n", p);
One would normally use a const char * for such strings.
const char *p = "abc";
*p = 'A'; // Compilation error.
printf("%s\n", p);
CodePudding user response:
Regarding
char * read = "Hello";
read[0]='B';
printf("%s\n", read); // still prints "Hello"
you have tripped over a backward compatibility wart in the C specification.
String constants are read-only. char *, however, is a pointer to modifiable data.
The type of a string constant ought to be const char [N], where N is the number of chars given by the contents of the constant, plus one. However, const did not exist in the original C language (prior to C89). So there was, in 1989, a whole lot of code that used char * to point to string constants. So the C committee made the type of string constants be char [N], even though they are read-only, to keep that code working.
Writing through a char * that points to a string constant triggers undefined behavior; anything can happen. I would have expected a crash, but the write getting discarded is not terribly surprising either.
In C++ the type of string constants is in fact const char [N], and the above fragment would have failed to compile. Some C compilers have an optional mode you can turn on that changes the type of string constants to const char [N]; for instance, GCC and clang have the -Wwrite-strings command line option. Using this mode for new programs is a good idea.
CodePudding user response:
Your long examples can be reduced to your last question.
This is why I think char * is read only, get bus error after attempt to modify read[0]
char * read = "Hello";
read[0]='B';
printf("%s\n", read); // <=== Bus error
"Hello" is a string literal. The attempt to modify the string literal manifested itself as the bus error. Your pointer references memory that should not be modified.
How to sort it out? You need to define a pointer that references a modifiable object:
char * read = (char []){"Hello"};
read[0]='B';
printf("%s\n", read);
So, as you see, declaring the pointer as pointing to modifiable data does not by itself make that data modifiable.
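Besides a compound literal, a writable copy can also live in a named array or on the heap. The sketch below shows both; modifiable_copy and first_char_after_edit are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated, modifiable copy of s,
   or NULL if allocation fails. The caller must free() the result. */
char *modifiable_copy(const char *s)
{
    assert(s != NULL);
    size_t size = strlen(s) + 1;     /* include the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}

/* Hypothetical helper: a named array initialised from a literal is writable,
   because the array holds its own copy of the characters. */
char first_char_after_edit(void)
{
    char text[] = "Hello";
    text[0] = 'B';                   /* modifies the array, not the literal */
    return text[0];
}
```

In both cases the string literal is only read, never written, so the undefined behavior from the original snippet does not arise.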