Tokenize a std::string to a struct

Time:05-06

Let's say I have the following string that I want to tokenize as per the delimiter '>':

std::string veg = "orange>kiwi>apple>potato";

I want every item in the string to be placed in a structure that has the following format:

struct pack_item
{
    std::string it1;
    std::string it2;
    std::string it3;
    std::string it4;
};

I know how to do it this way:

pack_item pitem;

std::stringstream veg_ss(veg);
std::string veg_item;

std::getline(veg_ss, veg_item, '>');
pitem.it1 = veg_item;
std::getline(veg_ss, veg_item, '>');
pitem.it2 = veg_item;
std::getline(veg_ss, veg_item, '>');
pitem.it3 = veg_item;
std::getline(veg_ss, veg_item, '>');
pitem.it4 = veg_item;

Is there a better and one-liner kind of way to do it?

CodePudding user response:

Something like this:

#include <string>
#include <vector>
#include <sstream>
#include <iostream>

std::string veg = "orange>kiwi>apple>potato";

typedef std::vector<std::string> it_vec;

int main(int argc, char* argv[]) {
    it_vec vec;
    
    std::stringstream veg_ss(veg);
    std::string veg_item;

    while (std::getline(veg_ss, veg_item, '>')) {
        vec.push_back(veg_item);
    }
    
    for (const std::string& vec_item : vec) {
        std::cout << vec_item << std::endl;
    }
}
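
If the tokens still need to end up in pack_item, the vector can be unpacked afterwards. This is only a minimal sketch: split and to_pack_item are hypothetical helper names, and it assumes that missing tokens should be left empty and extra ones dropped.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct pack_item
{
    std::string it1;
    std::string it2;
    std::string it3;
    std::string it4;
};

// Hypothetical helper: split s on delim, same loop as in main above.
std::vector<std::string> split(const std::string& s, char delim)
{
    std::vector<std::string> out;
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        out.push_back(item);
    }
    return out;
}

// Hypothetical helper: move the first four tokens into the struct.
// Assumption: fewer than four tokens leaves the remaining members empty,
// more than four tokens drops the extras.
pack_item to_pack_item(std::vector<std::string> vec)
{
    vec.resize(4);
    return pack_item{std::move(vec[0]), std::move(vec[1]),
                     std::move(vec[2]), std::move(vec[3])};
}
```

Calling to_pack_item(split(veg, '>')) would then give the same struct as the four getline calls in the question.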

CodePudding user response:

You don't need an intermediate variable.

pack_item pitem;

std::stringstream veg_ss(veg);

std::getline(veg_ss, pitem.it1, '>');
std::getline(veg_ss, pitem.it2, '>');
std::getline(veg_ss, pitem.it3, '>');
std::getline(veg_ss, pitem.it4, '>');

You might want to make that a function, e.g. operator >> (with a similar operator <<)

std::istream& operator >>(std::istream& is, pack_item & pitem) {
    std::getline(is, pitem.it1, '>');
    std::getline(is, pitem.it2, '>');
    std::getline(is, pitem.it3, '>');
    std::getline(is, pitem.it4, '>');
    return is;
}

std::ostream& operator <<(std::ostream& os, const pack_item & pitem) {
    return os << pitem.it1 << '>'
              << pitem.it2 << '>'
              << pitem.it3 << '>'
              << pitem.it4 << '>';
}

int main() {
    std::stringstream veg_ss("orange>kiwi>apple>potato>");
    pack_item pitem;
    veg_ss >> pitem;
}

"Is there a better and one-liner kind of way to do it?"

You can make a type whose >> reads in a string up to a delimiter, and read all four elements in one statement. Is that really "better"?

template <bool is_const>
struct delimited_string;

template<>
struct delimited_string<true> {
    const std::string & string;
    char delim;
};

template<>
struct delimited_string<false> {
    std::string & string;
    char delim;
};

delimited_string(const std::string &, char) -> delimited_string<true>;
delimited_string(std::string &, char) -> delimited_string<false>;

std::istream& operator >>(std::istream& is, delimited_string<false> s) {
    return std::getline(is, s.string, s.delim);
}

template <bool is_const>
std::ostream& operator <<(std::ostream& os, delimited_string<is_const> s) {
    return os << s.string << s.delim;
}

std::istream& operator >>(std::istream& is, pack_item & pitem) {
    return is >> delimited_string { pitem.it1, '>' }
              >> delimited_string { pitem.it2, '>' }
              >> delimited_string { pitem.it3, '>' }
              >> delimited_string { pitem.it4, '>' };
}

std::ostream& operator <<(std::ostream& os, const pack_item & pitem) {
    return os << delimited_string { pitem.it1, '>' }
              << delimited_string { pitem.it2, '>' }
              << delimited_string { pitem.it3, '>' }
              << delimited_string { pitem.it4, '>' };
}

CodePudding user response:

As suggested in the comments, you could use a for loop as such:

pack_item a;
std::array<std::reference_wrapper<std::string>, 4> arr{a.it1, a.it2, a.it3, a.it4};

constexpr std::string_view veg = "orange>kiwi>apple>potato";
std::istringstream ss(veg.data());

std::string str;

for(std::size_t idx = 0; idx < arr.size() && std::getline(ss, str, '>'); ++idx){
    arr[idx].get() = std::move(str);
}

If you meant "one-liner" in its true sense, then you could be nasty and use:

std::getline(std::getline(std::getline(std::getline(ss, a.it1, '>'), a.it2, '>'), a.it3, '>'), a.it4, '>');

CodePudding user response:

Indeed:

#include <iostream>
#include <sstream>
#include <string>

struct pack_item
{
    std::string it1;
    std::string it2;
    std::string it3;
    std::string it4;
};

pack_item pack( const std::string & s )
{
  pack_item p;
  getline(getline(getline(getline(std::istringstream(s), p.it1,'>'), p.it2,'>'), p.it3,'>'), p.it4);
  return p;
}

int main()
{
  auto pitem = pack( "orange>kiwi>apple>potato" );
  
  std::cout << pitem.it4 << "<" << pitem.it3 << "<" << pitem.it2 << "<" << pitem.it1 << "\n";
}

BTW, there is nothing wrong with multiple lines of code. The quest for the one-liner is often a distraction to doing things the Right Way™.

CodePudding user response:

What I would do is add a constructor that takes a std::string_view (plus a second, defaulted argument for the separator) and use std::find.

The reason for using std::string_view is explained here: How exactly is std::string_view faster than const std::string&?

#include <algorithm>
#include <string>
#include <string_view>

struct pack_item
{
    std::string it1;
    std::string it2;
    std::string it3;
    std::string it4;

    pack_item() = default;

    pack_item(std::string_view in, char sep = '>'){

        auto ptr = in.begin();
        auto l_ptr = ptr;
        ptr = std::find(ptr, in.end(), sep);
        it1 = std::string(l_ptr, ptr);
        if(ptr != in.end()) ++ptr;  // step past the separator
        l_ptr = ptr;
        ptr = std::find(ptr, in.end(), sep);
        it2 = std::string(l_ptr, ptr);
        if(ptr != in.end()) ++ptr;
        l_ptr = ptr;
        ptr = std::find(ptr, in.end(), sep);
        it3 = std::string(l_ptr, ptr);
        if(ptr != in.end()) ++ptr;
        l_ptr = ptr;
        ptr = std::find(ptr, in.end(), sep);
        it4 = std::string(l_ptr, ptr);

    }

};

You can see here that this can be easily converted into a loop if you want and stop it by checking:

if(ptr == in.end()) break;
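
The loop conversion mentioned above could look roughly like this. It is only a sketch: it uses a free function with the hypothetical name make_pack_item instead of the constructor, and a small array of pointers to the four members so they can be visited in order.

```cpp
#include <algorithm>
#include <string>
#include <string_view>

struct pack_item
{
    std::string it1;
    std::string it2;
    std::string it3;
    std::string it4;
};

// Hypothetical helper (not part of the answer above): fills the four
// members in order and stops early when the input runs out.
pack_item make_pack_item(std::string_view in, char sep = '>')
{
    pack_item p;
    std::string* const fields[] = {&p.it1, &p.it2, &p.it3, &p.it4};

    auto ptr = in.begin();
    for (std::string* field : fields) {
        auto l_ptr = ptr;                     // start of the current token
        ptr = std::find(ptr, in.end(), sep);  // separator or end of input
        field->assign(l_ptr, ptr);
        if (ptr == in.end()) break;           // no more tokens
        ++ptr;                                // step past the separator
    }
    return p;
}
```

With this version an input with fewer than four tokens leaves the remaining members empty, and anything after the fourth token is ignored.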