Obtaining start position of istringstream token

Time:12-01

Is there a way to find the start position of tokens extracted by istringstream::operator >>?

For example, here is my current failed attempt, which checks tellg() before and after each extraction:

string test = "   first     \"  in \\\"quotes \"  last";
istringstream strm(test);

while (!strm.eof()) {

    string token;
    auto startpos = strm.tellg();
    strm >> quoted(token);
    auto endpos = strm.tellg();
    if (endpos == -1) endpos = test.length();

    cout << token << ": " << startpos << " " << endpos << endl;

}

So the output of the above program is:

first: 0 8
  in "quotes : 8 29
last: 29 35

The end positions are fine, but the start positions are the start of the whitespace leading up to the token. The output I want would be something like:

first: 3 8
  in "quotes : 13 29
last: 31 35

Here's the test string with positions for reference:

          1111111111222222222233333
01234567890123456789012345678901234  the end is -1

   first     "  in \"quotes "  last

        ^--------------------^-----^ the end positions i get and want
^-------^--------------------^------ the start positions i get
   ^---------^-----------------^---- the start positions i *want*

Is there any straightforward way to retrieve this information when using an istringstream?

Answer:

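First, see "Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?". In short, eofbit is only set after a read has already run into the end of the stream, so the loop should test the result of the extraction itself rather than eof().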
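
To make that point concrete, here is a minimal sketch comparing the two loop styles on an input with trailing whitespace. The helper names count_with_eof_loop and count_with_extraction_loop are made up for this illustration and are not part of any library:

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: counts iterations of an eof()-controlled loop.
std::size_t count_with_eof_loop(const std::string& text) {
    std::istringstream strm(text);
    std::size_t n = 0;
    std::string token;
    while (!strm.eof()) {
        strm >> token;  // may fail without extracting anything...
        ++n;            // ...but the iteration is counted anyway
    }
    return n;
}

// Hypothetical helper: counts only extractions that actually succeeded.
std::size_t count_with_extraction_loop(const std::string& text) {
    std::istringstream strm(text);
    std::size_t n = 0;
    std::string token;
    while (strm >> token) {
        ++n;
    }
    return n;
}
```

For the input "a b " (note the trailing space), the eof() loop runs three times even though there are only two tokens, because eofbit is not set until the third extraction fails. The extraction-controlled loop runs exactly twice.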

Second, you can use the std::ws stream manipulator to consume the leading whitespace before reading the next token. tellg() then reports the start positions you are looking for, e.g.:

#include <iostream>
#include <string>
#include <sstream>
#include <iomanip>
using namespace std;

...

string test = "   first     \"  in \\\"quotes \"  last";
istringstream strm(test);

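// skip leading whitespace first, so tellg() points at the token itself
while (strm >> ws) {

    string token;
    auto startpos = strm.tellg();
    if (!(strm >> quoted(token))) break;  // stop when nothing could be extracted
    auto endpos = strm.tellg();
    // tellg() returns -1 once the last token has hit the end of the stream
    if (endpos == -1) endpos = test.length();

    cout << token << ": " << startpos << " " << endpos << endl;
}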
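
If you want the positions as data rather than printed output, the same loop can be wrapped in a function. This is only a sketch under a few assumptions: the TokenSpan struct and the tokenize_with_positions name are invented for this example, and std::quoted requires C++14 or later:

```cpp
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for illustration; not part of the standard library.
struct TokenSpan {
    std::string text;      // token with quotes and escapes already removed
    std::streamoff start;  // offset of the token's first character in the input
    std::streamoff end;    // offset one past the token's last character
};

// Hypothetical helper built on the std::ws + tellg() approach shown above.
std::vector<TokenSpan> tokenize_with_positions(const std::string& input) {
    std::vector<TokenSpan> spans;
    std::istringstream strm(input);

    // Consume leading whitespace so tellg() reports where the token starts.
    while (strm >> std::ws) {
        std::streamoff start = strm.tellg();
        std::string token;
        if (!(strm >> std::quoted(token))) break;  // nothing left to extract
        std::streamoff end = strm.tellg();
        // tellg() reports -1 after the final token has reached end of stream.
        if (end == -1) end = static_cast<std::streamoff>(input.length());
        spans.push_back({token, start, end});
    }
    return spans;
}
```

Called with the test string from the question, this should yield three entries with the positions 3 to 8, 13 to 29, and 31 to 35. Since start and end are offsets into the original string, input.substr(start, end - start) gives back the raw token text, including the quotes and escape characters.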
