Handling consecutive delimiters with strsep() in C

Time:03-28

I am trying to read a string word by word in C using the strsep() function (the same can be done with strtok()). When there are consecutive delimiters, in my case spaces, the function does not skip them. I am required to use strsep() and couldn't figure out a solution. I'd appreciate any help.

#include <stdio.h>
#include <string.h>

int main(){

  char newLine[256]= "scalar             i";

  char *q;
  char *token;
  q = strdup(newLine);

  const char delim[] = " ";
    
  token = strsep(&q, delim);
  printf("The token is: \"%s\"\n", token);
    
  token = strsep(&q, delim);    
  printf("The token is: \"%s\"\n", token);

  return 0;
} 

Actual output is:

The token is: "scalar"
The token is: ""

What I expected is:

The token is: "scalar"
The token is: "i"

To do that, I also tried writing a while loop that keeps going until the token is non-empty. But I cannot equate tokens with "", " ", NULL or "\n". Somehow the token is not equal to any of these.

CodePudding user response:

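First, note that strsep(), while convenient, is not part of the standard C library. It comes from 4.4BSD and is only available where the C library provides that extension, such as the BSDs, macOS, and glibc-based Linux systems. That covers most Unix-like systems today, but it is still not guaranteed everywhere.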

Anyway, strsep() supports empty fields. That means that if your string has consecutive delimiters, it returns an empty, zero-length token between each adjacent pair of delimiters. For example, the tokens for the string "ab  cd" (two spaces) will be:

  1. "ab"
  2. ""
  3. "cd"

2 delimiters -> 3 tokens.
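
To check the counting rule for yourself, here is a minimal sketch. The helper name count_fields is made up for illustration, and the _DEFAULT_SOURCE define is only there on the assumption that you build against glibc in a strict -std mode; other platforms may not need it.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; asks it to declare strsep() and strdup() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts every field strsep() reports, empty ones included.
   Returns -1 if the working copy cannot be allocated. */
int count_fields(const char *input, const char *delims)
{
    char *copy = strdup(input);   /* strsep() writes '\0' into its buffer, so work on a copy */
    if (copy == NULL)
        return -1;

    char *cursor = copy;          /* strsep() advances cursor; copy is kept for free() */
    int fields = 0;
    while (strsep(&cursor, delims) != NULL)
        fields++;                 /* empty tokens are counted as well */

    free(copy);
    return fields;
}
```

Calling count_fields("ab  cd", " ") should give 3, matching the list above, while the single-space "ab cd" gives 2.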

Now, you also said:

I cannot equate tokens with "", " ", NULL or "\n". Somehow the token is not equal to any of these.

I am guessing that what you tried was a simple comparison, e.g. if (my_token == "") { ... }. That won't work, because it compares pointers, not the strings' contents. Two strings may hold identical characters at different places in memory, and that is exactly the case here: my_token points into the buffer allocated by strdup(), never at the static-storage-duration string literal "" used in the comparison. The same goes for " " and "\n", and the comparison with NULL fails because strsep() only returns NULL once the input is exhausted.

Instead, you will need to use strcmp(my_token, "") == 0, or better yet, just check whether the first char is '\0', i.e. my_token[0] == '\0'.
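
Putting the two points together, one way to get the behaviour you expected is to wrap strsep() in a loop that throws away empty tokens. This is only a sketch: next_word is a name I made up, not a library function, and the _DEFAULT_SOURCE define again assumes glibc.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; asks it to declare strsep() */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: behaves like strsep(), but skips the empty tokens
   that runs of delimiters produce. Returns NULL when no words are left. */
char *next_word(char **cursor, const char *delims)
{
    char *token;
    while ((token = strsep(cursor, delims)) != NULL) {
        if (token[0] != '\0')   /* inspect the contents, not the pointer value */
            return token;
    }
    return NULL;                /* input exhausted */
}
```

With your string and q as the cursor, the first call to next_word(&q, delim) returns "scalar" and the second returns "i". Keep the pointer returned by strdup() in a separate variable if you want to free() it later, since strsep() changes q.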
