Why am I able to modify char * in this example?

Time:11-23

I have problems understanding how char* works.

In the example below, struses() is called by main(). Because s1 is const, I created buf to hold a modifiable copy of it, and then I call sortString() on that copy.

This version makes sense to me as I understand that char[] can be modified:

#include "common.h"
#include <stdbool.h>
void sortString(char string[50]);

bool struses(const char *s1, const char *s2) 
{

    char buf[50];
    strcpy(buf, s1);  // <===== input = "perpetuity";
    sortString(buf);
    printf("%s\n", buf); // prints "eeipprttuy"
    return true;
}

void sortString(char string[50]) 
{
    char temp;
    int n = strlen(string);
    for (int i = 0; i < n - 1; i++)
    {
        for (int j = i + 1; j < n; j++)
        {
            if (string[i] > string[j])
            {
                temp = string[i];
                string[i] = string[j];
                string[j] = temp;
            }
        }
    }
}

However, in this version I deliberately changed the type to char* which is supposed to be read-only. Why do I still get the same result?

#include "common.h"
#include <stdbool.h>
void sortString(char *string);

bool struses(const char *s1, const char *s2)
{

    char buf[50];
    strcpy(buf, s1); 
    sortString(buf);
    printf("%s\n", buf);
    return true;
}

void sortString(char *string)  // <==== changed the type
{
    char temp;
    int n = strlen(string);
    for (int i = 0; i < n - 1; i++)
    {
        for (int j = i + 1; j < n; j++)
        {
            if (string[i] > string[j])
            {
                temp = string[i];
                string[i] = string[j];
                string[j] = temp;
            }
        }
    }
}

This is why I think char * is read-only. I get a bus error when I try to modify read[0]:

char * read = "Hello";
read[0]='B';// <=== Bus error
printf("%s\n", read); 

Update: somehow the program really does not raise a bus error. I guess that since the behaviour is undefined, the result is not predictable either?

I can print this without a bus error.


CodePudding user response:

The compiler adjusts the array-type parameter in this function declaration

void sortString(char string[50]);

to a pointer to the element type:

void sortString(char *string);

So, for example, these function declarations are equivalent and all declare the same function:

void sortString(char string[100]);
void sortString(char string[50]);
void sortString(char string[]);
void sortString(char *string);
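
One way to observe this adjustment is the minimal sketch below, which assumes a C11 compiler (for `_Generic`). The names `set_first_array`, `set_first_ptr`, `IS_CHAR_PTR_FN` and `demo_adjustment` are made up for illustration and do not come from the question.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helpers: the same operation, with the parameter spelled two ways. */
void set_first_array(char text[16]) { text[0] = 'X'; }
void set_first_ptr(char *text)      { text[0] = 'X'; }

/* Evaluates to 1 if f has type void (*)(char *), otherwise 0 (C11 _Generic). */
#define IS_CHAR_PTR_FN(f) _Generic((f), void (*)(char *): 1, default: 0)

/* Both spellings have the same function type and both modify the caller's array. */
void demo_adjustment(void)
{
    char a[16] = "abc";
    char b[16] = "abc";

    set_first_array(a);   /* the array decays to a pointer to a[0] */
    set_first_ptr(b);

    assert(strcmp(a, "Xbc") == 0);
    assert(strcmp(b, "Xbc") == 0);
    assert(IS_CHAR_PTR_FN(&set_first_array) == 1);
    assert(IS_CHAR_PTR_FN(&set_first_ptr) == 1);
}
```

The number inside the brackets is ignored for type purposes, so neither version receives a copy of the array; both receive a pointer into the caller's storage.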

This function

void sortString(char *string)

is called with the character array buf, which holds a copy of the passed array (or of the string literal passed through a pointer to it):

char buf[50];
strcpy(buf, s1);
sortString(buf);

So there is no problem. s1 can be a pointer to a string literal, but the content of the string literal is copied into the character array buf, and it is that array that is changed.
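
The copy-then-modify pattern can be sketched as follows. This is only an illustration under two assumptions of mine: the helper name `sorted_copy` is hypothetical, and it swaps in the standard `qsort` for the question's hand-written loops. It also adds a length check, since `strcpy` into a 50-byte buffer would overflow for longer inputs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: orders bytes as unsigned char values. */
static int compare_chars(const void *a, const void *b)
{
    unsigned char x = *(const unsigned char *)a;
    unsigned char y = *(const unsigned char *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper: copies src into dst and sorts the copy in place.
 * Returns 0 on success, -1 if dst is too small. src itself is never written.
 */
int sorted_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (len >= dst_size)
        return -1;                    /* no room for the text plus '\0' */

    memcpy(dst, src, len + 1);        /* modifiable copy, including '\0' */
    qsort(dst, len, sizeof dst[0], compare_chars);
    return 0;
}
```

Called as `sorted_copy(buf, sizeof buf, "perpetuity")` with `char buf[50]`, this leaves "eeipprttuy" in `buf`, while the string literal is only read.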

As for this code snippet

char * read = "Hello";
read[0]='B';
printf("%s\n", read);   // <=== still prints "Hello"

it has undefined behavior because you may not change a string literal.

From the C Standard (6.4.5 String literals)

7 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

Note that in C++, unlike in C, string literals have types of constant character arrays. It is also advisable in C to declare pointers to string literals with the const qualifier to avoid undefined behavior, for example:

const char * read = "Hello";

By the way, the function sortString performs redundant swaps of elements in the passed string. It is better to declare and define it the following way:

// Selection sort
char * sortString( char *s ) 
{
    for ( size_t i = 0, n = strlen( s ); i != n; i++ )
    {
        size_t min_i = i;

        for ( size_t j = i + 1; j != n; j++ )
        {
            if ( s[j] < s[min_i] )
            {
                min_i = j;
            }
        }

        if ( i != min_i )
        {
            char c = s[i];
            s[i] = s[min_i];
            s[min_i] = c;
        }
    }

    return s;
}
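
To make the "redundant swaps" remark concrete, here is a rough sketch that instruments both approaches with a swap counter. The function names `exchange_sort_swaps` and `selection_sort_swaps` are hypothetical; the loop bodies mirror the question's nested-loop sort and the selection sort above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static void swap_chars(char *a, char *b)
{
    char t = *a;
    *a = *b;
    *b = t;
}

/* The question's approach: swap whenever s[i] > s[j]. Returns the swap count. */
size_t exchange_sort_swaps(char *s)
{
    size_t swaps = 0;
    size_t n = strlen(s);

    for (size_t i = 0; i + 1 < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (s[i] > s[j]) {
                swap_chars(&s[i], &s[j]);
                swaps++;
            }
    return swaps;
}

/* Selection sort: find the minimum first, then swap at most once per position. */
size_t selection_sort_swaps(char *s)
{
    size_t swaps = 0;
    size_t n = strlen(s);

    for (size_t i = 0; i != n; i++) {
        size_t min_i = i;

        for (size_t j = i + 1; j != n; j++)
            if (s[j] < s[min_i])
                min_i = j;

        if (i != min_i) {
            swap_chars(&s[i], &s[min_i]);
            swaps++;
        }
    }
    return swaps;
}
```

Both functions produce the same sorted string. The selection sort never performs more than one swap per position, whereas the nested-loop version may swap the same position several times before the final value settles there.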

CodePudding user response:

char * does not mean read-only. char * simply means pointer to char.

You have likely been taught that string literals, such as "Hello", may not be modified. That is not quite true; a correct statement is that the C standard does not define what happens when you attempt to modify a string literal, and C implementations commonly place string literals in read-only memory.

We can define objects with the const qualifier to say we intend not to modify them and to allow the compiler to place them in read-only memory (although it is not obligated to). If we were defining the C language from scratch, we would specify that string literals are const-qualified, so the pointers that come from string literals would be const char *.

However, when C was first developed, there was no const, and string literals produced pointers that were just char *. The const qualifier came later, and it is too late to change string literals to be const-qualified because of all the old code using char *.

Because of this, it is possible that a char * points to characters in a string literal that should not be modified (because the behavior is not defined). But char * in general does not mean read-only.
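
A small sketch of how the two pointer types are typically used in function signatures. The helper names `count_char` and `replace_char` are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* const char *: the function promises not to write through the pointer,
   so it can safely be given a string literal or a modifiable array. */
size_t count_char(const char *text, char wanted)
{
    size_t count = 0;

    for (; *text != '\0'; text++)
        if (*text == wanted)
            count++;
    return count;
}

/* char *: the function may write through the pointer, so the caller must
   pass modifiable storage (an array or allocated memory), never a literal. */
void replace_char(char *text, char from, char to)
{
    for (; *text != '\0'; text++)
        if (*text == from)
            *text = to;
}
```

With `char word[] = "banana";`, calling `replace_char(word, 'a', 'o')` is fine because `word` is an array the program owns. Calling `replace_char("banana", 'a', 'o')` compiles in C but has undefined behavior, which is exactly the trap described above.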

CodePudding user response:

Your premise that the area pointed to by a char * isn't modifiable is false. This is perfectly fine:

char s[] = "abc";       // Same as: char s[4] = { 'a', 'b', 'c', 0 };
char *p = s;            // Same as: char *p = &(s[0]);
*p = 'A';
printf("%s\n", p);      // Abc


The reason you had a fault is that you tried to modify the string created by a string literal. This is undefined behaviour:

char *p = "abc";
*p = 'A';               // Undefined behaviour
printf("%s\n", p);

One would normally use a const char * for such strings.

const char *p = "abc";
*p = 'A';               // Compilation error.
printf("%s\n", p);


CodePudding user response:

Regarding

char * read = "Hello";
read[0]='B';
printf("%s\n", read);   // still prints "Hello"

you have tripped over a backward compatibility wart in the C specification.

String constants are read-only. char *, however, is a pointer to modifiable data. The type of a string constant ought to be const char [N] where N is the number of chars given by the contents of the constant, plus one. However, const did not exist in the original C language (prior to C89). So there was, in 1989, a whole lot of code that used char * to point to string constants. So the C committee made the type of string constants be char [N], even though they are read-only, to keep that code working.

Writing through a char * that points to a string constant triggers undefined behavior; anything can happen. I would have expected a crash, but the write getting discarded is not terribly surprising either.

In C++ the type of string constants is in fact const char [N], and the above fragment would have failed to compile. Some C compilers have an optional mode you can turn on that changes the type of string constants to const char [N]; for instance, GCC and clang have the -Wwrite-strings command line option. Using this mode for new programs is a good idea.

CodePudding user response:

Your long examples can be reduced to your last question.

This is why I think char * is read only, get bus error after attempt to modify read[0]

char * read = "Hello";
read[0]='B';            // <=== Bus error
printf("%s\n", read);

"Hello" is a string literal. The attempt to modify the string literal manifested itself as the bus error.

Your pointer is referencing the memory which should not be modified.

How do you sort it out? Define a pointer that references a modifiable object:

char * read = (char []){"Hello"};
read[0]='B';
printf("%s\n", read); 
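
A compound literal written inside a function only lives until the enclosing block ends, so a pointer to it must not be returned. When the modifiable string has to outlive the function, one option is a heap copy. The sketch below is an assumption of mine rather than part of the answer, and the helper name `copy_string` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a heap-allocated, modifiable copy of src,
 * or NULL if the allocation fails. The caller must free() the result.
 */
char *copy_string(const char *src)
{
    size_t size = strlen(src) + 1;    /* text plus the terminating '\0' */
    char *copy = malloc(size);

    if (copy != NULL)
        memcpy(copy, src, size);
    return copy;
}
```

For example, `char *text = copy_string("Hello");` followed by a NULL check allows `text[0] = 'B';`, because the write lands in the allocated copy and not in the literal. Remember to call `free(text)` afterwards.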

So, as you can see, declaring the pointer as char * does not by itself make the data it points to modifiable.
