Home > front end >  How to use strtok on char*
How to use strtok on char*

Time:04-24

In c , to filter out the delimiter using strtok, the source has to be a char array, otherwise, it gives me a seg fault. How can I use strtok on a pointer to char?

Code example of how to structure strtok:

#include <stdio.h>
#include <string.h>

int main () {
  char str[] ="- This, a sample string."; // this is the string i want to split. notice how it's an array
  char * pch;
  pch = strtok (str," ,.-");
  while (pch != NULL)
  {
    printf ("%s\n",pch);
    pch = strtok (NULL, " ,.-");
  }
  return 0;
}

Example of what I want to do:

/* strtok example */
#include <stdio.h>
#include <string.h>

int main ()
{
  char* str ="- This, a sample string."; // since this is a pointer to char, it gives a segmentation fault after compiling, and executing.
  char * pch;
  pch = strtok (str," ,.-");
  while (pch != NULL)
  {
    printf ("%s\n",pch);
    pch = strtok (NULL, " ,.-");
  }
  return 0;
}

CodePudding user response:

Example of what I want to do:

char* str ="- This, a sample string."

You cannot do what you want because string literals are not implicitly convertible to pointer to non-const char in C . Furthermore, strtok modifies the argument string, and srtring literals must not be modified in C .

How to use strtok on c on char*

If you really want to, you can do this:

char str_arr[] ="- This, a sample string.";
char* str = str_arr;

But it would be rather pointless.


In order to tokenise a string literal without copying it into a modifiable array, you must not use strtok.

CodePudding user response:

You are trying to modify a string literal (the function strtok changes the source string inserting null characters '\0')

char* str ="- This, a sample string.";

First of all in C opposite to C string literals have types of constant character arrays. So you have to write the declaration of the pointer in a C program with the qualifier const.

const char* str ="- This, a sample string.";

Any attempt to change a string literal in C and C results in undefined behavior.

For example in the C Standard there is written (6.4.5 String literals)

7 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

So it is always better also in C to declare pointers to string literals with the qualifier const.

Instead of strtok you could use for example C standard string function strspn and strcspn.

Here is a demonstration program.

#include <iostream>
#include <iomanip>
#include <string_view>
#include <cstring>

int main()
{
    const char *s = "- This, a sample string.";
    const char *delim = " ., -";

    for (const char *p = s; *( p  = strspn( p, delim ) ) != '\0'; )
    {
        auto n = strcspn( p, delim );

        std::string_view sv( p, n );

        std::cout << std::quoted( sv ) << ' ';

        p  = n;
    }

    std::cout << '\n';
}

The program output is

"This" "a" "sample" "string"

You could for example declare a vector of string views like std::vector<std::string_view> and store in it each substring.

For example

#include <iostream>
#include <iomanip>
#include <string_view>
#include <vector>
#include <cstring>

int main()
{
    const char *s = "- This, a sample string.";
    const char *delim = " ., -";

    std::vector<std::string_view> v;

    for (const char *p = s; *( p  = strspn( p, delim ) ) != '\0'; )
    {
        auto n = strcspn( p, delim );

        v.emplace_back( p, n );

        p  = n;
    }

    for (auto sv : v)
    {
        std::cout << std::quoted( sv ) << ' ';
    }
    std::cout << '\n';
}

The program output is the same as shown above.

Or if the compiler does not support C 17 then instead of a vector of the type std::vector<std::string_view> you can use a vector of the type std::vector<std::pair<const char *, size_t>>.

For example

#include <iostream>
#include <iomanip>
#include <utility>
#include <vector>
#include <cstring>

int main()
{
    const char *s = "- This, a sample string.";
    const char *delim = " ., -";

    std::vector<std::pair<const char *, size_t>> v;

    for (const char *p = s; *( p  = strspn( p, delim ) ) != '\0'; )
    {
        auto n = strcspn( p, delim );

        v.emplace_back( p, n );

        p  = n;
    }

    for (auto p : v)
    {
        std::cout.write( p.first, p.second ) << ' ';
    }
    std::cout << '\n';
}

The program output is

This a sample string

In C you can use a variable length array or a dynamically allocated array with the element type of a structure type that contains two data members of the type const char * and size_t similarly to the C class std::pair. But To define the array you at first need to calculate how many words there are in the string literal using the same for loop.

Here is a C demonstration program.

#include <stdio.h>
#include <string.h>

int main( void )
{
    const char *s = "- This, a sample string.";
    const char *delim = " ., -";

    size_t nmemb = 0;

    for (const char *p = s; *( p  = strspn( p, delim ) ) != '\0'; )
    {
          nmemb;
        size_t n = strcspn( p, delim );
        p  = n;
    }    

    struct SubString
    {
        const char *pos;
        size_t size;
    } a[nmemb];

    size_t i = 0;

    for (const char *p = s; *( p  = strspn( p, delim ) ) != '\0'; )
    {
        size_t n = strcspn( p, delim );

        a[i].pos = p;
        a[i].size =n;
          i;
        p  = n;
    }

    for ( i = 0; i < nmemb; i   )
    {
        printf( "%.*s ", ( int )a[i].size, a[i].pos );
    } 

    putchar( '\n' );   
}

The program output is

This a sample string
  • Related