trying to break a line into multiple tokens

Time:12-21

My problem is this: I have the string RGM 3 13 GName 0005 32 funny 0000 44 teste 0000\n and I want to split it like this:

13 GName
32 funny 
44 teste

so that I can save the numbers and names in an array. The problem is that, for some reason, declaring "" as the delimiter the way I did seems to be invalid in C, and it is not breaking the line at all.

Program:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <iostream>
#define line "RGM 3 13 GName 0005 32 funny 0000 44 teste 0000\n"

int main()
{
    char s[] = "";
    char* token;
    strtok(line,s);
    strtok(line,s);
    while( token != NULL )
    {
        printf( " %s\n", token );
        token = strtok(NULL,s);
    }
   
   return(0);
}

CodePudding user response:

If your lines always have the format A B n1 str1 code1 ... nn strn coden, and you discard A B, the simple C++ code below will suffice:

#include <iostream>  // cout
#include <sstream>  // istringstream
#include <string>

int main()
{
    const std::string line{"RGM 3 13 GName 0005 32 funny 0000 44 teste 0000"};

    std::istringstream iss{line};

    std::string token1{};
    std::string token2{};
    iss >> token1 >> token2; // get rid of header (RGM 3)

    std::string token3{};
    while (iss >> token1 >> token2 >> token3)
    {
        std::cout << token1 << " " << token2 << "\n";
    }
}

Notice that this does almost no checking. Should you need more control over your input, something more advanced could be implemented.

For example, the code below uses regular expressions. It first tries to match the line header RGM m (the text RGM and a 1-digit m), then it searches for groups of the form n str code (2-digit n, alphabetic str, 4-digit code):

#include <iostream>  // cout
#include <regex>  // regex_search, smatch
#include <string>

int main()
{
    std::string line{"RGM 3 13 GName 0005 32 funny 0000 44 teste 0000"};

    std::regex header_pattern{R"(RGM\s+\d)"};
    std::regex group_pattern{R"(\s+(\d{2})\s+([a-zA-Z]+)\s+\d{4})"};
    std::smatch matches{};
    if (std::regex_search(line, matches, header_pattern))
    {
        line = matches.suffix();
        while (std::regex_search(line, matches, group_pattern))
        {
            std::cout << matches[1] << " " << matches[2] << "\n";
            line = matches.suffix();
        }
    }
}

CodePudding user response:

strtok modifies the string passed to it, so passing a string literal is undefined behavior.

Instead, declare a writable array:

char line[] = "RGM 3 13 GName 0005 32 funny 0000 44 teste 0000\n";

The prototype gives a hint about this:

char* strtok( char* str, const char* delim );

The first argument is not const in any way, so strtok is free to write into it.
