Get byte representation of C++ class

I have objects that I need to hash with SHA256. The object has several fields as follows:

class Foo {
    // some methods
protected:
    std::array<int, 32> x;
    char y[32];
    long z;
};

Is there a way I can directly access the bytes representing the 3 member variables in memory, as I would with a struct? These hashes need to be computed as quickly as possible, so I want to avoid malloc'ing a new set of bytes and copying into a heap-allocated array. Or is the answer simply to embed a struct within the class?

It is critical that I get the exact binary representation of these variables so that the SHA256 comes out exactly the same whenever the 3 variables are equal (so I can't have any extra padding bytes etc. going into the hash function).

CodePudding user response：

Most hash classes can take multiple regions of data before returning the digest, for example:

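class Hash {
public:
    virtual ~Hash() = default;
    // Feed another region of bytes into the running hash.
    virtual void update(const void *data, size_t size) = 0;
    // Finish and return the hash of everything fed so far.
    virtual std::vector<uint8_t> digest() = 0;
};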

So your hash method could look like this:

std::vector<uint8_t> Foo::hash(Hash *hash) const {
    hash->update(&x, sizeof(x));
    hash->update(&y, sizeof(y));
    hash->update(&z, sizeof(z));
    return hash->digest();
}

CodePudding user response：

You can solve this by making an iterator that knows the layout of your member variables. Make Foo::begin() and Foo::end() functions and you can even take advantage of range-based for loops.

If you can increment it and dereference it, you can use it any other place you're able to use a LegacyForwardIterator.

Add in comparison functions to get access to the common it = X.begin(); it != X.end(); ++it idiom.

Some downsides include: ugly library code, poor maintainability, and (in its current form) no regard for endianness.

The solution to the latter downside is left as an exercise to the reader.

#include <array>
#include <iostream>

class FooByteIter;  // forward declaration so Foo can name it as a return type

class Foo {
    friend class FooByteIter;

public:
    FooByteIter begin() const;

    FooByteIter end() const;

    Foo(const std::array<int, 2>& x, const char (&y)[2], long z)
    : x_{x}
    , y_{y[0], y[1]}
    , z_{z}
    {}

protected:
    std::array<int, 2> x_;
    char y_[2];
    long z_;
};

class FooByteIter {
public:
    FooByteIter(const Foo& foo)
        : ptr_{reinterpret_cast<const char*>(&(foo.x_))}
        , x_end_{reinterpret_cast<const char*>(&(foo.x_)) + sizeof(foo.x_)}
        , y_begin_{reinterpret_cast<const char*>(&(foo.y_))}
        , y_end_{reinterpret_cast<const char*>(&(foo.y_)) + sizeof(foo.y_)}
        , z_begin_{reinterpret_cast<const char*>(&(foo.z_))}
    {}

    static FooByteIter end(const Foo& foo) {
        FooByteIter fbi{foo};
        fbi.ptr_ = reinterpret_cast<const char*>(&foo.z_) + sizeof(foo.z_);

        return fbi;
    }

    bool operator==(const FooByteIter& other) const { return ptr_ == other.ptr_; }
    bool operator!=(const FooByteIter& other) const { return ! (*this == other); }

    FooByteIter& operator++() {
        ++ptr_;
        if (ptr_ == x_end_) {
            ptr_ = y_begin_;
        }
        else if (ptr_ == y_end_) {
            ptr_ = z_begin_;
        }

        return *this;
    }

    FooByteIter operator++(int) {
        FooByteIter pre = *this;
        ++(*this);
        return pre;
    }

    char operator*() const {
        return *ptr_;
    }

private:
    const char* ptr_;

    const char* const x_end_;
    const char* const y_begin_;
    const char* const y_end_;
    const char* const z_begin_;
};

FooByteIter Foo::begin() const {
    return FooByteIter(*this);
}

FooByteIter Foo::end() const {
    return FooByteIter::end(*this);
}

template <typename InputIt>
char checksum(InputIt first, InputIt last) {
    char check = 0;
    while (first != last) {
        check += (*first);
        ++first;
    }

    return check;
}

int main() {
    Foo f{{1, 2}, {3, 4}, 5};
    for (const auto b : f) {
        std::cout << (int)b << ' ';
    }

    std::cout << std::endl;

    std::cout << "Checksum is: " << (int)checksum(f.begin(), f.end()) << std::endl;
}

You can generalize this further by making serialization functions for all data types you might care about, allowing serialization of classes that aren't plain-old-data types.

Warning

This code assumes that the types being serialized contain no internal padding of their own. It works for this class because each member's type is padding-free. To handle members whose types do contain padding, the same approach would have to be applied recursively, all the way down to padding-free types.

CodePudding user response：

Just cast a pointer to the object to a pointer to char. You can then step through the bytes by incrementing the pointer, using sizeof(foo) as the upper bound.