Reading a tab delimited text file with leading tabs on some lines

Time:11-05

I am working to build a console-based spreadsheet app that takes in a UTF-8 encoded text file as input and outputs the results to the console.

Column values are separated by tabs and each new line is a new row. I am having some issues reading in the tab-delimited input text file where some of the lines (rows) are starting with a tab indicating that there is no value in the first column(s). I would like to just extract the "filled" cells and use the data elsewhere in the program and discard or ignore the "empty" cells. Using the '\t' delimiter in the getline() function does not seem to ignore these leading tabs. Thank you ahead of time for any help or code suggestions.

Example Input:

1 \t 2
\t 3
\t \t =A1 B1 B2 

The simple code I've been using is below:

#include <iostream>
#include <fstream>
#include <string>

// Variable declarations
std::ifstream sheetFile;
std::string input;

int main(int argc, char *argv[])
{
    sheetFile.open(argv[1]);
    while (getline(sheetFile, input, '\t'))
    {
        std::cout << input << std::endl;
    }

    sheetFile.close();
    return 0;
}

And the return to the console is:

1
2

3


=A1 B1 B2

Answer:

You can use two levels of std::getline() calls: an outer loop that reads each line (delimited by \n), and an inner loop that puts the line into a std::istringstream and calls std::getline() on that stream to split the columns on \t, skipping any cell that comes back empty, e.g.:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>

int main(int argc, char *argv[])
{
    // Variable declarations
    std::ifstream sheetFile(argv[1]);
    std::string line, input;

    while (std::getline(sheetFile, line))
    {
        std::istringstream iss(line);
        while (std::getline(iss, input, '\t'))
        {
            if (!input.empty())
                std::cout << input << std::endl;
        }
    }

    return 0;
}
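
Since the question mentions using the filled cells elsewhere in the program, the same nested-getline idea can be wrapped in a function that returns the cells instead of printing them. This is only a minimal sketch: the name readFilledCells and the vector-of-vectors return type are illustrative assumptions, not part of the original answer. It takes any std::istream, so it works with a std::ifstream or an in-memory std::istringstream.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name and return type are assumptions for illustration):
// returns one vector per input line, holding that line's non-empty
// tab-separated cells in the order they appear.
std::vector<std::vector<std::string>> readFilledCells(std::istream &in)
{
    std::vector<std::vector<std::string>> rows;
    std::string line, cell;

    // Outer loop: one iteration per '\n'-terminated line.
    while (std::getline(in, line))
    {
        std::istringstream iss(line);
        std::vector<std::string> row;

        // Inner loop: split the line on '\t' and drop empty cells.
        while (std::getline(iss, cell, '\t'))
        {
            if (!cell.empty())
                row.push_back(cell);
        }
        rows.push_back(row);
    }
    return rows;
}
```

Called as readFilledCells(sheetFile), this keeps the row boundaries but not the original column positions, because the empty cells are discarded. If column letters matter later (for example when resolving a reference such as A1), the empty cells would need to be kept instead.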

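Alternatively, with a single std::getline() loop, you can use the std::ws stream manipulator to skip leading whitespace before each read, which includes \t and \n characters (and spaces). Be aware that \t is then the only delimiter, so the last cell on a line keeps its trailing \n, and if the next line does not start with a tab, that cell runs on into the next line's first cell: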

#include <iostream>
#include <fstream>
#include <string>
#include <iomanip>

int main(int argc, char *argv[])
{
    // Variable declarations
    std::ifstream sheetFile(argv[1]);
    std::string input;

    while (std::getline(sheetFile >> std::ws, input, '\t'))
    {
        std::cout << input << std::endl;
    }

    return 0;
}
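
One way to keep cells from running across line boundaries while still using std::ws is to apply it per line rather than to the whole file. The sketch below is an assumption-laden variation, not the original answer's code: cellsOfLine is a hypothetical name, and it expects a line that has already been read with std::getline(). Because std::ws skips all whitespace, it also strips leading spaces from each cell, which may or may not be acceptable for your data.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name is an assumption for illustration): splits one
// already-read line on '\t', using std::ws to skip the whitespace (tabs and
// spaces) in front of each cell, so empty cells never produce a token.
std::vector<std::string> cellsOfLine(const std::string &line)
{
    std::istringstream iss(line);
    std::vector<std::string> cells;
    std::string cell;

    // std::ws consumes leading whitespace; getline then reads up to the next tab.
    // When only whitespace (or nothing) remains, the read fails and the loop ends.
    while (std::getline(iss >> std::ws, cell, '\t'))
        cells.push_back(cell);

    return cells;
}
```

It would be called from an outer while (std::getline(sheetFile, line)) loop, once per line. Trailing spaces inside a cell are kept; only whitespace in front of a cell is skipped.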