I am trying to read a string word by word in C using the strsep() function (the same thing can also be done with strtok()). When there are consecutive delimiters (spaces, in my case), the function does not skip them. I am required to use strsep() and couldn't figure out a solution. I'd appreciate any help.
#include <stdio.h>
#include <string.h>
int main(){
    char newLine[256]= "scalar  i"; /* two spaces between the words */
    char *q;
    char *token;
    q = strdup(newLine);
    const char delim[] = " ";
    token = strsep(&q, delim);
    printf("The token is: \"%s\"\n", token);
    token = strsep(&q, delim);
    printf("The token is: \"%s\"\n", token);
    return 0;
}
Actual output is:
The token is: "scalar"
The token is: ""
What I expected is:
The token is: "scalar"
The token is: "i"
To do that I also tried to write a while loop so that I could continue until the token is non-empty. But I cannot equate tokens with "", " ", NULL or "\n". Somehow the token is not equal to any of these.
Answer:
First, note that strsep(), while convenient, is not part of the standard C library. It originated in 4.4BSD and is available in glibc and the BSD-derived C libraries. That covers most Unix-like systems today, but it is not guaranteed to exist everywhere.
Anyway, strsep() supports empty fields. That means that if your string has consecutive delimiters, it will find an empty, length-0 token between each pair of adjacent delimiters. For example, the tokens for the string "ab  cd" (two spaces between ab and cd) will be:
"ab"
""
"cd"
2 delimiters -> 3 tokens.
Now, you also said:
I cannot equate tokens with "", " ", NULL or "\n". Somehow the token is not equal to any of these.
I am guessing that what you tried was a simple comparison, e.g. if (my_token == "") { ... }. That won't work, because it compares pointers, not the strings' contents. Two strings can hold identical characters at different places in memory, and that is certainly the case here: my_token points into the buffer you passed to strsep(), never at the static-storage-duration string literal "" used in the comparison.
Instead, you will need to use strcmp(my_token, "") == 0 or, better yet, just check manually whether the first char is '\0', i.e. my_token[0] == '\0'.