Read a .dat file in C++ into multiple data types

Time:12-26

I'm using C++ to build my optimization model on Gurobi, and I have a question on how to assign values to coefficients. Currently, I define them in the .cpp file as

const int A = 4;
double B[] = { 1, 2, 3 };
double C[][A] = { 
    { 5, 1, 0, 3 },
    { 7, 0, 2, 4 },
    { 4, 6, 8, 9 } 
};

which means B[1]=1, B[2]=2, B[3]=3, and C[1][1]=5, C[1][2]=1, etc. However, I would like to run the same model for different sets of coefficients, so instead of changing values in the .cpp file, it would be easier if I could read them from multiple .dat files. May I know how to do it?

And is it OK if I save the .dat file in the following format?

[4]
[1, 2, 3]
[[5, 1, 0, 3],
[7, 0, 2, 4],
[4, 6, 8, 9]]

CodePudding user response:

I would not recommend that. Some people would recommend using JSON or YAML but if your coefficients will always be so simple, here is a recommendation:

Data file:

4
1 2 3
5 1 0 3
7 0 2 4
4 6 8 9

Code:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>


struct Coefficients {
    unsigned A;
    std::vector<double> B;
    std::vector< std::vector<double> > C;
};

std::vector<double> parseFloats( const std::string& s ) {
    std::istringstream isf( s );
    std::vector<double> res;
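    double value;
    // Loop on the extraction itself, so that trailing white space or an
    // empty line does not append a spurious element
    while ( isf >> value ) {
        res.push_back( value );
    }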
    return res;
}

void readCoefficients( std::istream& fs, Coefficients& c ) {
    fs >> c.A;
    std::ws( fs );
    std::string line;
    std::getline( fs, line );
    c.B = parseFloats( line );
    while ( std::getline( fs, line ) ) {
        c.C.push_back( parseFloats( line ) );
    }
}

One example of usage:

std::string data = R"(
4
1 2 3
5 1 0 3
7 0 2 4
4 6 8 9
)";

int main() {
    Coefficients coef;
    std::istringstream isf( data );
    readCoefficients( isf, coef );
    std::cout << "A:" << coef.A << std::endl;
    std::cout << "B:" << std::endl << "  ";
    for ( double val : coef.B ) {
        std::cout << val << " ";
    }
    std::cout << std::endl;
    std::cout << "C:" << std::endl;
    for ( const std::vector<double>& row : coef.C  ) {
        std::cout << "  ";
        for ( double val : row ) {
            std::cout << val << " ";
        }
        std::cout << std::endl;
    }
}

Result:

A:4
B:
  1 2 3 
C:
  5 1 0 3 
  7 0 2 4 
  4 6 8 9 

Code: https://godbolt.org/z/9s3zffahj
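Since the question asks about reading from .dat files rather than from a string, here is a minimal sketch of how the same parsing could be pointed at a std::ifstream. The helper name loadCoefficients and the file names are assumptions made for illustration and are not part of the answer above. The struct and parseFloats are repeated so that the sketch compiles on its own.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct Coefficients {
    unsigned A = 0;
    std::vector<double> B;
    std::vector< std::vector<double> > C;
};

// Parse all white-space separated doubles found in one line
std::vector<double> parseFloats( const std::string& s ) {
    std::istringstream isf( s );
    std::vector<double> res;
    double value;
    while ( isf >> value ) {
        res.push_back( value );
    }
    return res;
}

// Hypothetical helper: open `path` and read A, B and the rows of C from it
Coefficients loadCoefficients( const std::string& path ) {
    std::ifstream file( path );
    if ( !file ) {
        throw std::runtime_error( "cannot open " + path );
    }
    Coefficients c;
    file >> c.A >> std::ws;            // first line: A
    std::string line;
    std::getline( file, line );        // second line: B
    c.B = parseFloats( line );
    while ( std::getline( file, line ) ) {
        std::vector<double> row = parseFloats( line );
        if ( !row.empty() ) {          // skip blank lines
            c.C.push_back( row );
        }
    }
    return c;
}
```

Calling loadCoefficients once per data set, for example with "set1.dat" and "set2.dat" (made-up names), would then give one Coefficients object per file without touching the .cpp file.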

CodePudding user response:

Gurobi. Very interesting!

And you have chosen C++ to interface with it. Good.

Then let us concentrate a little on how we would do things in C++.

If we look at your first definition:

const int A = 4;
double B[] = { 1, 2, 3 };
double C[][A] = { 
    { 5, 1, 0, 3 },
    { 7, 0, 2, 4 },
    { 4, 6, 8, 9 } 
};

We can see here C-Style arrays. The number of elements of the B array is defined by the number of initializer elements.

C is a 2-dimensional matrix. Its number of columns is defined by `const int A = 4`. So, I am not sure whether A is just a size or really a coefficient. But in the end, it does not matter.

First important information: In C++ we usually do not use C-style arrays []. We basically have 2 versatile workhorses for that:

  1. std::array, if the size of the array is known at compile time
  2. The main C++ container, std::vector. An array that can grow dynamically as needed. It is extremely powerful and used a lot in C++. It also knows how many elements it contains, so no explicit definition like A=4 is needed.

And the std::vector is the container to use for this purpose. Please read here about the std::vector.
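As a small illustration of the two containers, the coefficients from the question could be written as follows. This is only a sketch; the helper columnCount is a name invented here to show that the container itself reports its size.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Size fixed at compile time: std::array
const std::array<double, 3> B = { 1, 2, 3 };

// Size decided at run time: std::vector (here a vector of rows)
const std::vector<std::vector<double>> C = {
    { 5, 1, 0, 3 },
    { 7, 0, 2, 4 },
    { 4, 6, 8, 9 }
};

// Hypothetical helper: the column count comes from the container itself,
// so a separate constant like A is not required to know it
std::size_t columnCount( const std::vector<std::vector<double>>& m ) {
    return m.empty() ? 0 : m.front().size();
}
```

Here B.size() and C.size() give the number of elements and rows, and columnCount(C) gives the length of the first row.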

For that reason, I am not sure if A is needed in your coefficient information at all. Anyway, I will add it.

Next, C++ is an object-oriented language. In the very beginning, the language was even called ‘C with Classes’.

And one idea of the object-oriented approach is to put data, and the methods operating on this data, together in one class (or struct).

So, we can define a Coefficient class and store here all coefficient data like your A, B, C. And then, and most important, we will add functions to this class that will operate on this data. In our special case we will add read and write functionality.

As you know, C++ uses a very versatile IO stream library. It has so-called “extraction” operators >> and “inserter” operators <<. The “streams” are implemented as a class hierarchy. That means it does not matter which stream (e.g. std::cout, a file stream or a string stream) you use the << or >> operators on; they work basically the same way everywhere.

And the extractor >> and inserter << operators are already overloaded for many data types. Because of that, you can output many different data types to, for example, std::cout.

This will of course not work for custom types, like our class “Coefficient”. But here, we can simply add the functionality, by defining the appropriate inserter and extractor operators. And after that, we can use our new type in the same way as other, built in data types.

Then let us now look at the first code example:

struct Coefficient {
    // The data
    int A{};
    std::vector<double> B{};
    std::vector<std::vector<double>> C{};

    friend std::istream& operator >>(std::istream& is, Coefficient& coefficient);
    friend std::ostream& operator << (std::ostream& os, const Coefficient& coefficient);
};

We will show later the implementation for the operators.

Note, this mechanism is also called (de)serialization, because the data will be written/read in a serial and human-readable way. We need to take care that the output and the input structure of the data are the same, so that we can always use our 2 operators together.

You should understand already now, that we later can have an extremely simple handling of IO operations in main or other functions. Let us look at main already now:

// Some example data in a stream
std::istringstream exampleFile{ R"(4
1 2 3  
5 1 0 3  
7 0 2 4  
4 6 8 9  )" };

// Test/Driver code
int main() {
    // Here we have our coefficients
    Coefficient coefficient{};

    // Simply extract all data from the file and store it in our coefficients variable
    exampleFile >> coefficient;

    // One-liner debug output
    std::cout << coefficient;
}

This looks very intuitive and similar to the input and output of other, built-in data types.

Let us now come to the actual input and output functions. And, because we want to keep it simple, we will structure the data in your “.dat” file in an easy-to-read way. And that is: white-space separated data. So: 81 999 42 and so on. Why is that simple? Because in C++ the formatted input functions (those with the extractor >>) will read such data easily. Example:

int x,y,z;
std::cin >> x >> y >> z;

If you give a white-space separated input as shown above, it will read the characters, convert them to numbers and store them in the variables.

There is one problem in C++: the end-of-line character ‘\n’ is also treated as white space by formatted input. So, reading values in a loop would not stop at the end of a line. The standard solution for this problem is to use an unformatted input function like std::getline and first read a complete line into a std::string variable. Then we put this string into a std::istringstream, which is again a stream, and extract the values from there.
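To make the difference concrete, here is a minimal sketch that reads the same input both ways. The function names readFlat and readByLine are made up for this illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Formatted extraction alone: '\n' is skipped like any other white space,
// so the values of all lines end up in one flat list
std::vector<double> readFlat( std::istream& is ) {
    std::vector<double> values;
    double v;
    while ( is >> v ) values.push_back( v );
    return values;
}

// std::getline first, then extraction per line: the line structure is kept
std::vector<std::vector<double>> readByLine( std::istream& is ) {
    std::vector<std::vector<double>> rows;
    std::string line;
    while ( std::getline( is, line ) ) {
        std::istringstream iss( line );
        rows.push_back( readFlat( iss ) );
    }
    return rows;
}
```

For an input consisting of the two lines "1 2 3" and "5 1 0 3", readFlat returns one vector with 7 values, while readByLine returns two rows with 3 and 4 values.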

In your “.dat” file you have many lines with data. So, we need to do the above operation repeatedly. And for things that need to be done repeatedly, we use functions in C++. We need a function that receives a stream (any stream), reads the values, stores them in a std::vector and returns the vector.

Before I show you this function, I will save some typing work and abbreviate the vector and the 2d-vector with a using statement.

Please see:

// Some abbreviations for easier typing and reading
using DVec = std::vector<double>;
using DDVec = std::vector<DVec>;

// ---------------------------------------------------------------------------
// A function to retrieve a number of double values from a stream for one line
DVec getDVec(std::istream& is) {

    // Read one complete line
    std::string line{}; std::getline(is, line);

    // Put it in an istringstream for better extracting
    std::istringstream iss(line);

    // And use the istream_iterator to iterate over all doubles and put the data in the resulting vector
    return { std::istream_iterator<double>(iss), {} };
}

You see, a simple 3-line function. The last line is maybe difficult to understand for beginners. I will explain it later. So, our function expects a reference to a stream as input parameter and then returns a std::vector<double> containing all doubles from a line.

So, first, we read a complete line into a variable of type std::string. Ultra simple. Then, we put the string into a std::istringstream variable. This basically converts the string to a stream and allows us to use all stream functions on it. And remember why we did that: because we want to read a complete line and then extract the data from there. Now the last line:

return { std::istream_iterator<double>(iss), {} };

Uh, what’s that? We expect to return a std::vector<double>. The compiler knows that we want to return such a type. Therefore it will kindly create such a variable for us and use the range constructor no 5 (see here) to initialize our vector. And with what? You can read that it expects 2 iterators: a begin-iterator and an end-iterator. Everything from the begin-iterator up to, but not including, the end-iterator will be copied to the vector. And the std::istream_iterator will simply call the extractor operator >> repeatedly and with that read one double each time, until all values are read.
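If the one-liner still looks opaque, the following sketch spells out the two iterators and places an explicit loop next to it. Both functions are hypothetical helpers written for this comparison and are expected to return the same vector for the same line.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Range construction from a pair of stream iterators
std::vector<double> viaIterators( const std::string& line ) {
    std::istringstream iss( line );
    std::istream_iterator<double> first( iss );  // reads the first value
    std::istream_iterator<double> last;          // default-constructed: end of stream
    return std::vector<double>( first, last );   // copies [first, last)
}

// The same result with an explicit loop over operator >>
std::vector<double> viaLoop( const std::string& line ) {
    std::istringstream iss( line );
    std::vector<double> result;
    double value;
    while ( iss >> value ) result.push_back( value );
    return result;
}
```

The default-constructed std::istream_iterator acts as the end marker: the first iterator compares equal to it as soon as an extraction fails, which is what ends the copy.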

Cool!

Next we can use this functionality in our class’ extractor operator >>. This will then look like this:

    // Simple extraction operator
    friend std::istream& operator >>(std::istream& is, Coefficient& coefficient) {

        // Get A and all the B coefficients
        coefficient.B = std::move(getDVec(is >> coefficient.A >> std::ws));

        // And in a simple for loop, read all C-coefficients
        for (DVec dVec{ getDVec(is) }; is and not dVec.empty(); dVec = getDVec(is))
            coefficient.C.push_back(std::move(dVec));
        return is;
    }

It will first read the line with the B-coefficients and then, in a simple for loop, read all lines of the C-coefficients and add them to the 2d output vector.

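std::move avoids copying large data when a row is pushed into the 2d vector and gives us slightly better efficiency. (Around the call to getDVec it is not strictly needed, because the returned value is already a temporary.)

Output is even simpler. We use loops to show the data. Not much to explain here.

Now, we have all functions. We make our life simpler by splitting up a big problem into smaller problems.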

The final complete code would then look like this:

#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Some abbreviations for easier typing and reading
using DVec = std::vector<double>;
using DDVec = std::vector<DVec>;

// ---------------------------------------------------------------------------
// A function to retrieve a number of double values from a stream for one line
DVec getDVec(std::istream& is) {

    // Read one complete line
    std::string line{}; std::getline(is, line);

    // Put it in an istringstream for better extracting
    std::istringstream iss(line);

    // And use the istream_iterator to iterate over all doubles and put the data in the resulting vector
    return { std::istream_iterator<double>(iss), {} };
}

// -------------------------------------------------------------
// Coefficient class. Holds data and methods to operate on this data
struct Coefficient {
    // The data
    int A{};
    DVec B{};
    DDVec C{};

    // Simple extraction operator
    friend std::istream& operator >>(std::istream& is, Coefficient& coefficient) {

        // Get A and all the B coefficients
        coefficient.B = std::move(getDVec(is >> coefficient.A >> std::ws));

        // And in a simple for loop, read all C-coefficients
        for (DVec dVec{ getDVec(is) }; is and not dVec.empty(); dVec = getDVec(is))
            coefficient.C.push_back(std::move(dVec));
        return is;
    }

    // Even more simple inserter operator. Output values in loops
    friend std::ostream& operator << (std::ostream& os, const Coefficient& coefficient) {
        os << coefficient.A << '\n';
        for (const double d : coefficient.B) os << d << ' ';
        os << '\n';
        for (const DVec& dv : coefficient.C) {
            for (const double d : dv) os << d << ' ';
            os << '\n';
        }
        return os;
    }
};

// Some example data in a stream
std::istringstream exampleFile{ R"(4
1 2 3  
5 1 0 3  
7 0 2 4  
4 6 8 9  )" };

// Test/Driver code
int main() {
    // Here we have our coefficients
    Coefficient coefficient{};

    // Simply extract all data from the file and store it in our coefficients variable
    exampleFile >> coefficient;

    // One-liner debug output
    std::cout << coefficient;
}

Please again see the simple statements in main.
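Because the operators are written against std::istream and std::ostream, they should work unchanged on file streams, which is what the question is about. Below is a hedged sketch of a round trip through a .dat file. The helpers saveCoefficient and loadCoefficient are names assumed for this example, and the struct is repeated in compact form so that the sketch is self-contained.

```cpp
#include <fstream>
#include <iterator>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using DVec = std::vector<double>;
using DDVec = std::vector<DVec>;

// Read one line and return all doubles found on it
DVec getDVec( std::istream& is ) {
    std::string line;
    std::getline( is, line );
    std::istringstream iss( line );
    return DVec( std::istream_iterator<double>( iss ), std::istream_iterator<double>() );
}

struct Coefficient {
    int A{};
    DVec B{};
    DDVec C{};

    friend std::istream& operator>>( std::istream& is, Coefficient& c ) {
        c.B = getDVec( is >> c.A >> std::ws );
        for ( DVec row = getDVec( is ); is && !row.empty(); row = getDVec( is ) )
            c.C.push_back( std::move( row ) );
        return is;
    }

    friend std::ostream& operator<<( std::ostream& os, const Coefficient& c ) {
        os << c.A << '\n';
        for ( double d : c.B ) os << d << ' ';
        os << '\n';
        for ( const DVec& row : c.C ) {
            for ( double d : row ) os << d << ' ';
            os << '\n';
        }
        return os;
    }
};

// Hypothetical helper: write the coefficients to a file with operator <<
bool saveCoefficient( const std::string& path, const Coefficient& c ) {
    std::ofstream file( path );
    file << c;
    return static_cast<bool>( file );
}

// Hypothetical helper: read the coefficients from a file with operator >>
Coefficient loadCoefficient( const std::string& path ) {
    std::ifstream file( path );
    Coefficient c;
    file >> c;   // if the file cannot be opened, c simply stays empty
    return c;
}
```

Note that the inserter writes with the default stream precision of 6 significant digits, so coefficients with more digits would need something like std::setprecision before they survive the round trip exactly.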

I hope I could help you a little.

Some additional notes. In professional software development, code without comments is considered to have 0 quality.

Also, the guidelines on SO recommend, to not just dump code, but also give a comprehensive explanation.

If you should have further questions then ask, I am happy to answer.
