What is this weird output after using pointer arithmetic in C?

Time:03-26

My goal is to parse an input string into words, discarding the spaces but using them as the boundaries between words. The logic is: whenever the parser meets a space, it loops until the current character is no longer a space; when it meets a word, it loops until it reaches a space or '\0', storing each character in successive positions of one row of a 2D array. Before the outer while loop runs again, it advances to the next row.

I'm almost certain the logic is implemented correctly, but I get the weird output shown below. I've run into the same problem before when working with pointers, and I can't get this to work no matter what I do. Any ideas? I'm genuinely curious about the underlying reason.

#include <stdio.h>
#include <stdlib.h>


void print_mat(char **arry, int y, int x){
  for(int i=0;i<y;i++){
    for(int j=0;j<x;j++){
      printf("%c",arry[i][j]);
    }
    printf("\n");
  }
}

char **parse(char *str)
{
        char **parsed=(char**)malloc(sizeof(10*sizeof(char*)));
        for(int i=0;i<10;i++){
                parsed[i]=(char*)malloc(200*sizeof(char));
        }

        char **pointer = parsed;
        while(*str!='\0'){
                if(*str==32)
                {
                        while(*str==32 && *str!='\0'){
                                str++;
                        }
                }
                while(*str!=32 && *str!='\0'){
                        (*pointer) = (str);
                        (*pointer)++;
                        str++;
                }
                pointer++;
        }
        return parsed;
}

int main(){
  char str[] = "command -par1 -par2 thething";
  char**point=parse(str);
  print_mat(point,10,200);
  return 0;
}

 -par1 -par2 thethingUP%�W���U�6o� X%��U�v;,���UP%���cNjW��]A�aW�Ӹto�8so�z�

 -par2 thethingUP%�W���U�6o� X%��U�v;,���UP%���cNjW��]A�aW�Ӹto�8so�z�

 thethingUP%�W���U�6o� X%��U�v;,���UP%���cNjW��]A�aW�Ӹto�8so�z�

UP%�W���U�6o� X%��U�v;,���UP%���cNjW��]A�aW�Ӹto�8so�z�


I also tried simply indexing the 2D array, but to no avail:

char **parse(char *str)
{
        int i, j;
        i=0;
        j=0;
        char **parsed=(char**)malloc(sizeof(10*sizeof(char*)));
        for(int i=0;i<10;i++){
                parsed[i]=(char*)malloc(200*sizeof(char));
        }

        while(*str!='\0'){
                i=0;
                if(*str==32)
                {
                        while(*str==32 && *str!='\0'){
                                str++;
                        }
                }
                while(*str!=32 && *str!='\0'){
                        parsed[j][i] = (*str);
                        i++;
                        str++;
                }
                j++;
        }
        return parsed;
}

Output:

command�&�v�U`'�v�U0(�v�U)�v�U�)�v�U
-par1
-par2
thething
makefile:5: recipe for target 'build' failed
make: *** [build] Segmentation fault (core dumped)

CodePudding user response:

There are a couple of problems in your code:

  • Your program is leaking memory.
  • Your program is accessing memory which it does not own, which is undefined behavior (UB).

Let's discuss them one by one.

First problem - Memory leak:

Check this part of the parse() function:

                  while(*str!=32 && *str!='\0'){
                    (*pointer) = (str);

In the first iteration of the outer while loop, *pointer gives you the first member of the parsed array, i.e. parsed[0], which is a pointer to char. Note that you dynamically allocate memory for parsed[0], parsed[1], ... parsed[9] in parse() before the outer while loop. In the inner while loop you point them at str instead. They therefore lose their reference to the dynamically allocated memory, which causes a memory leak.
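To make the difference concrete, here is a minimal sketch contrasting the two operations. The helper name copy_word and its cap parameter are made up for illustration and are not part of the original program. Assigning to the pointer itself (as in parsed[0] = str) replaces the address returned by malloc, whereas indexing through the pointer writes into the allocated buffer and leaves the address intact.

```c
#include <stddef.h>

/* Hypothetical helper: copies one space-delimited word from src into dst.
 * dst must point to a buffer of at least cap bytes (cap > 0).
 * Copying stops early if the buffer is full.
 * Returns a pointer to the first character of src that was not copied. */
const char *copy_word(char *dst, size_t cap, const char *src)
{
    size_t i = 0;

    /* By contrast, "dst = (char *)src;" here would only change which
     * address the local pointer holds and copy nothing; done on a
     * malloc'd pointer in the caller, it would discard the allocation. */
    while (*src != ' ' && *src != '\0' && i + 1 < cap) {
        dst[i++] = *src++;   /* writes INTO the buffer dst points to */
    }
    dst[i] = '\0';           /* terminate so the result is a valid C string */
    return src;
}
```

Called as copy_word(parsed[0], 200, str), this would fill the buffer that parsed[0] already owns, so the pointer can still be passed to free later.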

Second problem - Accessing memory which it does not own:

As stated above, the pointers parsed[0], parsed[1], etc. end up pointing at whatever str pointed to in the inner while loop of parse(), and each is then incremented once more, so it is left just past the end of its word. That means they point at elements of the array str defined in main(). In print_mat() you pass 200 and read indices 0 to 199 through every pointer in arry. Since those pointers point into str, whose size is only 29 bytes, your program reads beyond the end of the array, which is UB.

Let's fix these problems in your code without changing much:

For memory leak:

Instead of pointing the pointers to str, assign characters of str to the allocated memory, like this:

                  int i = 0;
                  while(*str!=32 && *str!='\0'){
                     (*pointer)[i++] = (*str);
                     str++;
                  }

For accessing memory which it does not own:

A point that you should remember:
In C, strings are one-dimensional arrays of characters terminated by a null character '\0'.

First of all, empty the strings after dynamically allocating memory to them so that you can identify the unused pointers while printing them:

        for(int i=0;i<10;i++){
            parsed[i]=(char*)malloc(200*sizeof(char));
            parsed[i][0] = '\0';
        }
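The outer allocation in the question also deserves a second look. sizeof(10*sizeof(char*)) is the size of the expression's type (size_t), not the size of ten pointers, so on a typical 64-bit platform only 8 bytes are requested and storing parsed[1] through parsed[9] writes past the end of that block. This would be consistent with the pointer-like garbage after "command" and the segmentation fault in the second attempt, although undefined behavior gives no guarantees. A minimal sketch of the intended allocation follows; the names alloc_rows and free_rows are hypothetical, and the sketch assumes rows and cols are both greater than zero.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates `rows` zero-filled strings of `cols` bytes.
 * Returns NULL on allocation failure, after releasing anything already
 * allocated. Assumes rows > 0 and cols > 0. */
char **alloc_rows(size_t rows, size_t cols)
{
    /* rows * sizeof *table is the size of `rows` pointers;
     * sizeof(rows * sizeof *table) would only be sizeof(size_t). */
    char **table = malloc(rows * sizeof *table);
    if (table == NULL)
        return NULL;

    for (size_t i = 0; i < rows; i++) {
        table[i] = calloc(cols, 1);   /* zero-filled, so each row starts as "" */
        if (table[i] == NULL) {
            while (i > 0)
                free(table[--i]);
            free(table);
            return NULL;
        }
    }
    return table;
}

/* Releases a table created by alloc_rows. */
void free_rows(char **table, size_t rows)
{
    if (table == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(table[i]);
    free(table);
}
```

With these, the setup in parse() could become char **parsed = alloc_rows(10, 200); followed by a NULL check, and the caller would release everything with free_rows(parsed, 10).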

Terminate every string with a null character after writing a word to it:

                  int i = 0;
                  while(*str!=32 && *str!='\0'){
                     (*pointer)[i++] = (*str);
                     str++;
                  }
                  // Add null terminator
                  (*pointer)[i] = '\0';

In print_mat(), make sure you don't read beyond the null terminator once you reach it. Modify the condition of the inner for loop:

    for(int j = 0; (j < x) && (arry[i][j] != '\0'); j++){
       printf("%c",arry[i][j]);

You don't need to print the strings character by character; you can simply use the %s format specifier to print a string, like this:

    for (int i = 0; i < y; i++) {
        if (arry[i][0] != '\0') {
            printf ("%s\n", arry[i]);
        }
    }
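If you still want the column limit x as a safety net, the two ideas can be combined: printf accepts a precision for %s, and "%.*s" prints at most that many characters while still stopping at the null terminator. A small sketch, where the name print_rows and the returned count are additions for illustration only:

```c
#include <stdio.h>

/* Hypothetical variant of print_mat: prints at most x characters of each of
 * the y rows, stopping early at a row's null terminator. Empty rows are
 * skipped. Returns the number of rows actually printed. */
int print_rows(char **arry, int y, int x)
{
    int printed = 0;
    for (int i = 0; i < y; i++) {
        if (arry[i][0] != '\0') {
            printf("%.*s\n", x, arry[i]);   /* precision caps the length at x */
            printed++;
        }
    }
    return printed;
}
```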

With the changes suggested above (the minimum needed for your program to work properly), your code will look like this:

#include <stdio.h>
#include <stdlib.h>

void print_mat (char **arry, int y) {
    for (int i = 0; i < y; i++) {
        if (arry[i][0] != '\0') {
            printf ("%s\n", arry[i]);
        }
    }
}

char **parse(char *str) {
    // Note: the original sizeof(10*sizeof(char*)) yields sizeof(size_t),
    // not room for 10 pointers, so the outer sizeof is dropped here.
    char **parsed = (char**)malloc(10*sizeof(char*));
    // check malloc return

    for(int i = 0; i < 10; i++){
        parsed[i] = (char*)malloc(200*sizeof(char));
        // check malloc return
        parsed[i][0] = '\0';
    }

    char **pointer = parsed;
    while (*str != '\0') {
        if(*str == 32) {
            while(*str==32 && *str!='\0') {
                str++;
            }
        }

        int i = 0;
        while (*str != 32 && *str != '\0') {
            (*pointer)[i++] = (*str);
            str++;
        }

        (*pointer)[i] = '\0';
        pointer++;
    }
    return parsed;
}

int main (void) {
    char str[] = "command -par1 -par2 thething";

    char **point = parse(str);
    print_mat (point, 10);

    // free the dynamically allocated memory
    for (int i = 0; i < 10; i++) {
        free(point[i]);
    }
    free(point);

    return 0;
}

Output:

command
-par1
-par2
thething

There are many improvements that can be made to your implementation, for example:

  • As I have shown above, you can use the %s format specifier instead of printing a string character by character. I am leaving it up to you to identify similar changes and modify your program.
  • Allocate memory for a parsed array pointer only when there is a word in str to store.
  • Instead of allocating a fixed size (i.e. 200) for each parsed array pointer, allocate only as much memory as the word needs.

A few suggestions:

  • Always check the return value of functions like malloc.
  • Make sure to free the dynamically allocated memory once your program is done with it.
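The points above can be sketched roughly as follows. This is only one possible shape, not the original program: the names split_words and free_words are hypothetical, and the standard strspn and strcspn functions are swapped in to skip spaces and measure each word. Each word gets exactly the memory it needs, every malloc result is checked, and a matching function releases the words.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits str on spaces into at most max_words newly
 * allocated strings stored in out[]. Returns the number of words stored,
 * or -1 if an allocation fails (nothing is left allocated in that case). */
int split_words(const char *str, char **out, int max_words)
{
    int count = 0;

    while (count < max_words) {
        str += strspn(str, " ");            /* skip any run of spaces */
        if (*str == '\0')
            break;                          /* no more words */

        size_t len = strcspn(str, " ");     /* length of the current word */
        char *word = malloc(len + 1);       /* exactly the word plus '\0' */
        if (word == NULL) {
            while (count > 0)
                free(out[--count]);
            return -1;
        }
        memcpy(word, str, len);
        word[len] = '\0';

        out[count++] = word;
        str += len;
    }
    return count;
}

/* Frees the words produced by split_words. */
void free_words(char **words, int count)
{
    for (int i = 0; i < count; i++)
        free(words[i]);
}
```

A caller would declare char *words[10];, call int n = split_words(str, words, 10);, print words[0] through words[n - 1] with %s, and finish with free_words(words, n). Because only real words are stored, there are no empty rows to skip when printing.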

CodePudding user response:

You can achieve what you want in a simpler way.

First, define a function that checks if a character (separator) is present in a list of characters (separators):

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Returns true if c is found in a list of separators, false otherwise.
bool belongs(const char c, const char *list)
{
    for (const char *p = list; *p; ++p)
        if (*p == c) return true;

    return false;
}
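As a side note, the same membership test can be written with the standard strchr function. The sketch below uses the made-up name belongs_strchr to keep it distinct from the function above. The extra c != '\0' check matters because strchr treats the terminating null character as part of the string, so without it the end of the input would be reported as a separator.

```c
#include <stdbool.h>
#include <string.h>

/* Same contract as belongs(): true if c is one of the characters in list.
 * The c != '\0' check is needed because strchr also matches the
 * terminating null character of list. */
bool belongs_strchr(char c, const char *list)
{
    return c != '\0' && strchr(list, c) != NULL;
}
```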

Then, define a function that splits a given string into tokens, separated by one or more separators:

// Splits a string into tokens, separated by one or more of the separators in sep.
// Returns false if tokens is full or an allocation fails, true otherwise.
bool split(const char *s, const char *sep, char **tokens, size_t *ntokens, const size_t maxtokens)
{
    // Start with zero tokens.
    *ntokens = 0;

    const char *start = s;
    for (const char *p = s; /*no condition*/; ++p) {

        // Can no longer hold more tokens? Exit.
        if (*ntokens == maxtokens)
            return false;

        // Not a token boundary? Continue looping.
        if (*p && !belongs(*p, sep))
            continue;

        // Found a token: calculate its length.
        size_t tlength = p - start;

        // Empty token?
        if (tlength == 0) {
            // And reached the end of string? Break.
            if (!*p) break;

            // Not the end of string? Skip it.
            ++start;
            continue;
        }

        // Attempt to allocate memory.
        char *token = malloc(sizeof(*token) * (tlength + 1));

        // Failed? Exit.
        if (!token)
            return false;

        // Copy the token; the extra character copied is overwritten
        // by the terminator on the next line.
        strncpy(token, start, tlength + 1);
        token[tlength] = '\0';

        // Put it in tokens array.
        tokens[*ntokens] = token;

        // Update the number of tokens.
        *ntokens += 1;

        // Reached the end of string? Break.
        if (!*p) break;

        // There is more to parse. Set the start to the next char.
        start = p + 1;
    }

    return true;
}

Call it like this:

int main(void)
{
    char command[] = "command -par1 -par2 thing";
    
    const size_t maxtokens = 10;
    
    char **tokens = malloc(sizeof *tokens * maxtokens);
    if (!tokens) return 1;
    
    size_t ntokens = 0;
    
    split(command, " ", tokens, &ntokens, maxtokens);
    
    // Print all tokens.
    printf("Number of tokens = %zu\n", ntokens);
    for (size_t i = 0; i < ntokens; ++i)
        printf("%s\n", tokens[i]);

    // Release memory when done.
    for (size_t i = 0; i < ntokens; ++i)
        free(tokens[i]);
    
    free(tokens);
}

Output:

Number of tokens = 4
command
-par1
-par2
thing