I'm new to C++ and am learning how to read data from a CSV file. I want to read the following CSV data into vectors, one vector per row. The file name is path.csv:
0
0 1
0 2 4
0 3 6 7
I use the following function:
vector<vector<int>> read_multi_int(string path) {
    vector<vector<int>> user_vec;
    ifstream fp(path);
    string line;
    getline(fp, line);
    while (getline(fp, line)) {
        vector<int> data_line;
        string number;
        istringstream readstr(line);
        while (getline(readstr, number, ',')) {
            //getline(readstr, number, ',');
            data_line.push_back(atoi(number.c_str()));
        }
        user_vec.push_back(data_line);
    }
    return user_vec;
}
vector<vector<int>> path = read_multi_int("C:/Users/data/paths.csv");
Print function:
template <typename T>
void print_multi(T u)
{
    for (int i = 0; i < u.size(); i++) {
        if (u[i].size() > 1) {
            for (int j = 0; j < u[i].size(); j++) {
                //printf("%d ", u[i][j]);
                cout << u[i][j] << " ";
            }
            printf("\n");
        }
    }
    printf("\n");
}
Then I get
0 0 0
0 1 0
0 2 4
0 3 6 7
Zeros are added at the end of the rows. Is it possible to read the data from the CSV file without adding those extra zeros? Thanks!
CodePudding user response:
Based on the output you are seeing and the ',' delimiter in your code, I believe that your actual input data really looks like this:
A,B,C,D
0,,,
0,1,,
0,2,4,
0,3,6,7
So the main change is to replace atoi with strtol, as atoi will always return 0 on a failure to parse a number, but with strtol we can check if the parse succeeded.
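To see the difference in isolation, here is a minimal sketch. The helper name parse_int_field is hypothetical (it is not part of the code in this thread), and std::optional assumes C++17. It reports an empty or malformed field as "no value" instead of silently turning it into 0, which is what atoi does.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper: returns the parsed int, or std::nullopt if the field
// is empty, has trailing characters, or does not fit into an int.
std::optional<int> parse_int_field(const std::string& field) {
    char* end = nullptr;
    errno = 0;  // strtol only sets errno on failure, so clear it first
    const long value = std::strtol(field.c_str(), &end, 10);

    // Nothing was parsed, or something other than digits follows the number
    if (end == field.c_str() || *end != '\0') return std::nullopt;

    // The number is outside the range of long or of int
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) return std::nullopt;

    return static_cast<int>(value);
}
```

With this helper, an empty field such as the ones in "0,," yields no value, whereas std::atoi("") returns 0 and cannot be told apart from a real zero.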
That means that the solution is as follows:
// Needs <cerrno>, <climits> and <cstdlib> in addition to the headers already in use
vector<vector<int>> read_multi_int(string path) {
    vector<vector<int>> user_vec;
    ifstream fp(path);
    string line;
    getline(fp, line); // skip the header line
    while (getline(fp, line)) {
        vector<int> data_line;
        string number;
        istringstream readstr(line);
        while (getline(readstr, number, ',')) {
            char* end = nullptr;
            errno = 0; // strtol only sets errno on failure
            const long value = strtol(number.c_str(), &end, 10);
            if (end == number.c_str() || *end != '\0' ||
                errno == ERANGE || value < INT_MIN || value > INT_MAX)
            {
                // Could not convert: empty field, trailing characters, or out of range
            }else{
                data_line.emplace_back(static_cast<int>(value));
            }
        }
        user_vec.emplace_back(data_line);
    }
    return user_vec;
}
Then to display your results:
vector<vector<int>> path = read_multi_int("C:/Users/data/paths.csv");
for (const auto& row : path)
{
    for (const auto& s : row) std::cout << s << ' ';
    std::cout << std::endl;
}
This gives the expected output:
0
0 1
0 2 4
0 3 6 7
CodePudding user response:
Your code is already very good, but there is one obvious error in the reading code and another in your print function. Please see how I output the values with simple range-based for loops.
If your source file does not use a comma (',') but a different delimiter, then you need to call std::getline with that delimiter, in your case a blank (' '). The documentation for std::getline describes the delimiter parameter.
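As an illustration of how the delimiter changes what std::getline returns, here is a minimal sketch. The helper name split_fields is hypothetical and only serves this example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` at every `delimiter` and returns the
// raw text fields, including empty ones between two adjacent delimiters.
std::vector<std::string> split_fields(const std::string& line, char delimiter) {
    std::vector<std::string> fields;
    std::istringstream stream{ line };
    std::string field;
    while (std::getline(stream, field, delimiter)) {
        fields.push_back(field);
    }
    return fields;
}
```

Splitting "0 2 4" at ' ' yields the three fields "0", "2" and "4", while splitting the same line at ',' yields the single field "0 2 4". Splitting "0,,," at ',' yields "0" followed by two empty fields, and passing an empty field to atoi produces a 0.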
If we then use the following input
Header
0
0 1
0 2 4
0 3 6 7
with the corrected program:
#include <vector>
#include <fstream>
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
vector<vector<int>> read_multi_int(string path) {
    vector<vector<int>> user_vec;
    ifstream fp(path);
    string line;
    getline(fp, line);
    while (getline(fp, line)) {
        vector<int> data_line;
        string number;
        istringstream readstr(line);
        while (getline(readstr, number, ' ')) {
            //getline(readstr, number, ',');
            data_line.push_back(atoi(number.c_str()));
        }
        user_vec.push_back(data_line);
    }
    return user_vec;
}
int main() {
    vector<vector<int>> path = read_multi_int("C:/Users/data/paths.csv");
    for (vector<int>& v : path) {
        for (int i : v) std::cout << i << ' ';
        std::cout << '\n';
    }
}
then we receive this as output:
0
0 1
0 2 4
0 3 6 7
This is correct, but unfortunately differs from the output you showed.
So your output routine, or some other code, may also have a problem.
Besides, if there is no comma, you can take advantage of formatted input using the extraction operator >>. It skips leading whitespace, reads the next whitespace-separated token, and converts it to a number automatically.
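A minimal sketch of formatted extraction on a single line, with a hypothetical helper name read_ints used only for this example:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated integers from one line.
// Extraction stops at the end of the line or at the first character that
// cannot be part of an integer.
std::vector<int> read_ints(const std::string& line) {
    std::vector<int> numbers;
    std::istringstream stream{ line };
    int number = 0;
    while (stream >> number) {
        numbers.push_back(number);
    }
    return numbers;
}
```

For the line "0 3 6 7" this returns the four values 0, 3, 6 and 7. Note that it is only suitable for whitespace-separated data: for "0,1" the extraction stops at the comma and only 0 is returned.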
Additionally, it is strongly recommended to initialize all variables at their definition. You should always do this.
Modified to use formatted input, initialization, and perhaps better variable names, your code could look like this:
#include <vector>
#include <fstream>
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
vector<vector<int>> multipleLinesWithIntegers(const string& path) {
    // Here we will store the resulting 2d vector
    vector<vector<int>> result{};
    // Open the file
    ifstream fp{ path };
    // Read header line
    string line{};
    getline(fp, line);
    // Now read all lines with numbers in the file
    while (getline(fp, line)) {
        // Here we will store all numbers of one line
        vector<int> numbers{};
        // Put the line into an istringstream for easier extraction
        istringstream sline{ line };
        int number{};
        while (sline >> number) {
            numbers.push_back(number);
        }
        result.push_back(numbers);
    }
    return result;
}
int main() {
    vector<vector<int>> values = multipleLinesWithIntegers("C:/Users/data/paths.csv");
    for (const vector<int>& v : values) {
        for (const int i : v) std::cout << i << ' ';
        std::cout << '\n';
    }
}
The next step would be to use a somewhat more advanced style (this needs C++17 for the if statement with initializer and for class template argument deduction):
#include <vector>
#include <fstream>
#include <iostream>
#include <string>
#include <sstream>
#include <iterator>
auto multipleLinesWithIntegers(const std::string& path) {
    // Here we will store the resulting 2d vector
    std::vector<std::vector<int>> result{};
    // Open the file and check, if it could be opened
    if (std::ifstream fp{ path }; fp) {
        // Read header line
        if (std::string line{}; getline(fp, line)) {
            // Now read all lines with numbers in the file
            while (getline(fp, line)) {
                // Put the line into an istringstream for easier extraction
                std::istringstream sline{ line };
                // Get the numbers and add them to the result
                result.emplace_back(std::vector(std::istream_iterator<int>(sline), {}));
            }
        }
        else std::cerr << "\n\nError: Could not read header line '" << line << "'\n\n";
    }
    else std::cerr << "\n\nError: Could not open file '" << path << "'\n\n";
    return result;
}
int main() {
    const std::vector<std::vector<int>> values{ multipleLinesWithIntegers("C:/Users/data/paths.csv") };
    for (const std::vector<int>& v : values) {
        for (const int i : v) std::cout << i << ' ';
        std::cout << '\n';
    }
}
Edit
You have now shown your output routine. Its check u[i].size() > 1 skips every row that contains only one element. It should be changed to:
void printMulti(const std::vector<std::vector<int>>& u)
{
    for (std::size_t i = 0; i < u.size(); ++i) {
        if (u[i].size() > 0) {
            for (std::size_t j = 0; j < u[i].size(); ++j) {
                std::cout << u[i][j] << ' ';
            }
            std::cout << '\n';
        }
    }
    std::cout << '\n';
}