How to recursively serialize a class whose members are custom data types in C++?

Time:10-06

I want to serialize and deserialize a class Mango, so I have created the functions serialize and deserialize.

? serialize(Mango &Man) /// What should the return type be?
{
}

    
Mango deserialize(  ?   ) /// What should the function parameter be?
{
}

I don't know how to implement this efficiently in terms of speed, portability, and memory, because the class contains 10 members of custom data types (I show only one, but they are all similar), which are themselves very complex.

I would like suggestions for the implementation. For example, what should the return type of serialize be? A vector of bytes, i.e. std::vector<uint8_t> serialize(Mango &Man)? Or should it return nothing and just serialize into bytes stored in memory? Or is there another way?

Mango Class

class Mango
{
public:
    const MangoType &getMangoType() const { return typeMan; }
    MangoType &getMangoType() { return typeMan; }

private:
    // There are many members of different types : I just mention one.
    MangoType typeMan;
};

Data type classes

//MangoType Class
class MangoType
{
    /// It only has one member ie content
public:
    /// Getter of content vector.

    std::vector<FuntionMango> &getContent() noexcept { return Content; }

private:
    /// \name Data of MangoType.
    
    std::vector<FuntionMango> Content;
    
};


/// FuntionMango class.
class FuntionMango
{
public:
    /// Getter of param types.
    const std::vector<ValType> &getParamTypes() const noexcept
    {
        return ParamTypes;
    }
    std::vector<ValType> &getParamTypes() noexcept { return ParamTypes; }

    /// Getter of return types.
    const std::vector<ValType> &getReturnTypes() const noexcept
    {
        return ReturnTypes;
    }
    std::vector<ValType> &getReturnTypes() noexcept { return ReturnTypes; }

    

private:
    /// \name Data of FuntionMango.
   
    std::vector<ValType> ParamTypes;
    std::vector<ValType> ReturnTypes;

};

//ValType Class
  
enum class ValType : uint8_t
  {
     #define UseValType
     #define Line(NAME, VALUE, STRING) NAME = VALUE
     #undef Line
     #undef UseValType
  };

I want to know the best possible implementation plan in terms of speed and memory for serialize and deserialize functions.

Notes:
1) I do not want to transfer the data over the network. My use case is that loading data into the Mango class is very time-consuming every time (it is the result of a computation), so I want to serialize it once and simply deserialize the stored data the next time I need it.
2) I do not want to use a library that requires linking, such as Boost.Serialization. Is there any way to use it header-only?

CodePudding user response:

I commented:

Perhaps the examples here give you some inspiration (it's possible to write them without any Boost, obviously): "Boost Serialization Binary Archive giving incorrect output"

Because I hate when people say "obviously" on a Q&A site, let me show you:

Live On Coliru

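#include <algorithm>
#include <boost/endian/arithmetic.hpp>
#include <cstddef>     // size_t
#include <cstdint>     // uint8_t, uint16_t, uint32_t
#include <iomanip> // debug output
#include <iostream>
#include <iterator>    // back_inserter
#include <string>
#include <type_traits> // is_trivially_copyable_v
#include <vector>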

namespace my_serialization_helpers {

    ////////////////////////////////////////////////////////////////////////////
    // This namespace serves as an extension point for your serialization; in
    // particular we choose endianness and representation of strings
    //
    // TODO add overloads as needed (signed integer types, binary floats,
    // containers of... etc)
    ////////////////////////////////////////////////////////////////////////////
    
    // decide on the max supported string capacity:
    using string_size_type = boost::endian::big_uint32_t;
    
    ////////////////////////////////////////////////////////////////////////////
    // generators
    template <typename Out>
    Out do_generate(Out out, std::string const& data) {
        string_size_type len = data.length();
        out = std::copy_n(reinterpret_cast<char const*>(&len), sizeof(len), out);
        return std::copy(data.begin(), data.end(), out);
    }

    template <typename Out, typename T>
    Out do_generate(Out out, std::vector<T> const& data) {
        string_size_type len = data.size();
        out = std::copy_n(reinterpret_cast<char const*>(&len), sizeof(len), out);
        for (auto& el : data)
            out = do_generate(out, el);
        return out;
    }

    template <typename Out>
    Out do_generate(Out out, uint8_t const& data) {
        return std::copy_n(&data, sizeof(data), out);
    }

    template <typename Out>
    Out do_generate(Out out, uint16_t const& data) {
        boost::endian::big_uint16_t n(data);
        return std::copy_n(reinterpret_cast<char const*>(&n), sizeof(n), out);
    }

    template <typename Out>
    Out do_generate(Out out, uint32_t const& data) {
        boost::endian::big_uint32_t n(data);
        return std::copy_n(reinterpret_cast<char const*>(&n), sizeof(n), out);
    }

    ////////////////////////////////////////////////////////////////////////////
    // parsers
    template <typename It>
    bool parse_raw(It& in, It last, char* raw_into, size_t n) { // length guarded copy_n
        while (in != last && n) {
            *raw_into++ = *in++;
            --n;
        }
        return n == 0;
    }

    template <typename It, typename T>
    bool parse_raw(It& in, It last, T& into) {
        static_assert(std::is_trivially_copyable_v<T>);
        return parse_raw(in, last, reinterpret_cast<char*>(&into), sizeof(into));
    }

    template <typename It>
    bool do_parse(It& in, It last, std::string& data) {
        string_size_type len;
        if (!parse_raw(in, last, len))
            return false;
        data.resize(len);
        return parse_raw(in, last, data.data(), len);
    }

    template <typename It, typename T>
    bool do_parse(It& in, It last, std::vector<T>& data) {
        string_size_type len;
        if (!parse_raw(in, last, len))
            return false;
        data.clear();
        data.reserve(len);
        while (len--) {
            data.emplace_back();
            if (!do_parse(in, last, data.back()))
                return false;
        }
        return true;
    }

    template <typename It>
    bool do_parse(It& in, It last, uint8_t& data) {
        return parse_raw(in, last, data);
    }

    template <typename It>
    bool do_parse(It& in, It last, uint16_t& data) {
        boost::endian::big_uint16_t big_data;
        bool ok = parse_raw(in, last, big_data);
        if (ok)
            data = big_data;
        return ok;
    }

    template <typename It>
    bool do_parse(It& in, It last, uint32_t& data) {
        boost::endian::big_uint32_t big_data;
        bool ok = parse_raw(in, last, big_data);
        if (ok)
            data = big_data;
        return ok;
    }
}

struct SomeNestedType {
    std::vector<std::string> stuff;
    uint8_t more;

    template <typename Out>
    friend Out do_generate(Out out, SomeNestedType const& snt) {
        using my_serialization_helpers::do_generate;
        out = do_generate(out, snt.stuff);
        out = do_generate(out, snt.more);
        return out;
    }

    template <typename It>
    friend bool do_parse(It& in, It last, SomeNestedType& snt) {
        using my_serialization_helpers::do_parse;
        auto ok =
            do_parse(in, last, snt.stuff)
            && do_parse(in, last, snt.more)
            ;
        return ok;
    }
};

struct SomeCustomType {
    void DoSaveTests();

    SomeNestedType mNested;

    uint16_t    mProtocolVersion;
    uint16_t    mSessionFlags;
    uint16_t    mMaxResponseLength;
    std::string mMake;
    std::string mModel;
    std::string mSerialNumber;
    uint8_t     mTrackDelay;
    std::string mHeadUnitModel;
    std::string mCarModelYear;
    std::string mVin;
    uint16_t    mVehicleMileage;
    uint8_t     mShoutFormat;
    uint8_t     mNotificationInterval;

    template <typename Out>
    friend Out do_generate(Out out, SomeCustomType const& sct) {
        using my_serialization_helpers::do_generate;
        out = do_generate(out, sct.mNested);
        out = do_generate(out, sct.mProtocolVersion);
        out = do_generate(out, sct.mSessionFlags);
        out = do_generate(out, sct.mMaxResponseLength);
        out = do_generate(out, sct.mMake);
        out = do_generate(out, sct.mModel);
        out = do_generate(out, sct.mSerialNumber);
        out = do_generate(out, sct.mTrackDelay);
        out = do_generate(out, sct.mHeadUnitModel);
        out = do_generate(out, sct.mCarModelYear);
        out = do_generate(out, sct.mVin);
        out = do_generate(out, sct.mVehicleMileage);
        out = do_generate(out, sct.mShoutFormat);
        out = do_generate(out, sct.mNotificationInterval);
        return out;
    }

    template <typename Container>
    bool parse(Container const& bytes) {
        using std::begin;
        using std::end;
        auto in = begin(bytes), last = end(bytes);

        return do_parse(in, last, *this);
    }

    template <typename It>
    friend bool do_parse(It& in, It last, SomeCustomType& sct) {
        using my_serialization_helpers::do_parse;
        auto ok =
            do_parse(in, last, sct.mNested)
            && do_parse(in, last, sct.mProtocolVersion)
            && do_parse(in, last, sct.mSessionFlags)
            && do_parse(in, last, sct.mMaxResponseLength)
            && do_parse(in, last, sct.mMake)
            && do_parse(in, last, sct.mModel)
            && do_parse(in, last, sct.mSerialNumber)
            && do_parse(in, last, sct.mTrackDelay)
            && do_parse(in, last, sct.mHeadUnitModel)
            && do_parse(in, last, sct.mCarModelYear)
            && do_parse(in, last, sct.mVin)
            && do_parse(in, last, sct.mVehicleMileage)
            && do_parse(in, last, sct.mShoutFormat)
            && do_parse(in, last, sct.mNotificationInterval)
            ;
        return ok;
    }
};

#include <cassert>

int main() {
    SomeCustomType req {
        SomeNestedType { { "one", "two", "three" }, 1 + 2 + 3 },
        1 * 10000 + 14 * 100 + 4,
        1,                        
        0,                        
        "MyMake",                 
        "MyModel",                
        "10000",                 
        0,
        "Headunit",               
        "2014",                  
        "1234567980",           
        1000,                 
        3,                   
        1,                  
    };

    std::vector<uint8_t> bytes;
    do_generate(back_inserter(bytes), req);
    {
        std::cout << "\nSerialized " << std::dec << bytes.size() << " bytes:\n";
        for (auto ch : bytes)
            std::cout << "0x" << std::hex << std::setw(2) << std::setfill('0')
                      << static_cast<int>((uint8_t)ch) << " ";

        SomeCustomType clone;
        if (clone.parse(bytes))
        {
            std::vector<uint8_t> roundtrip;
            do_generate(back_inserter(roundtrip), clone); // re-serialize the parsed copy, not the original
            assert(roundtrip == bytes);
        } else {
            std::cout << "Roundtrip deserialization failed\n";
        }
    }
}

Which roundtrips correctly and prints the debug output:

Serialized 103 bytes:
0x00 0x00 0x00 0x03 0x00 0x00 0x00 0x03 0x6f 0x6e 0x65 0x00 0x00 0x00 0x03 0x74 0x77 0x6f 0x00 0x00 0x00 0x05 0x74 0x68 0x72 0x65 0x65 0x06 0x2c 0x8c 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x06 0x4d 0x79 0x4d 0x61 0x6b 0x65 0x00 0x00 0x00 0x07 0x4d 0x79 0x4d 0x6f 0x64 0x65 0x6c 0x00 0x00 0x00 0x05 0x31 0x30 0x30 0x30 0x30 0x00 0x00 0x00 0x00 0x08 0x48 0x65 0x61 0x64 0x75 0x6e 0x69 0x74 0x00 0x00 0x00 0x04 0x32 0x30 0x31 0x34 0x00 0x00 0x00 0x0a 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x39 0x38 0x30 0x03 0xe8 0x03 0x01 
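
Applied to the types from the question, the same length-prefixed pattern might look like the sketch below. It is self-contained and Boost-free. Several things in it are assumptions: the two ValType enumerators are invented (the question does not show the real ones), FuntionMango is reduced to a plain struct with public members, and the names Bytes, put_len and get_len are made up for illustration. Lengths are assumed to fit in 32 bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: stand-ins for the question's types. The real ValType
// enumerators are not shown in the question, so these two are invented.
enum class ValType : std::uint8_t { I32 = 0x7F, I64 = 0x7E };

struct FuntionMango {
    std::vector<ValType> ParamTypes;
    std::vector<ValType> ReturnTypes;
};

using Bytes = std::vector<std::uint8_t>;

// Length prefix: 4 bytes, big-endian, written with shifts (no Boost needed).
inline void put_len(Bytes& out, std::size_t n) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>(n >> shift));
}

// Reads a 4-byte big-endian length at pos; fails if fewer than 4 bytes remain.
inline bool get_len(Bytes const& in, std::size_t& pos, std::size_t& n) {
    if (pos > in.size() || in.size() - pos < 4)
        return false;
    n = 0;
    for (int i = 0; i < 4; ++i)
        n = (n << 8) | in[pos++];
    return true;
}

// vector<ValType>: length prefix followed by one byte per element.
inline void serialize(Bytes& out, std::vector<ValType> const& v) {
    put_len(out, v.size());
    for (ValType t : v)
        out.push_back(static_cast<std::uint8_t>(t));
}

inline bool deserialize(Bytes const& in, std::size_t& pos,
                        std::vector<ValType>& v) {
    std::size_t n = 0;
    if (!get_len(in, pos, n) || in.size() - pos < n)
        return false;
    v.clear();
    v.reserve(n); // safe: n was just checked against the remaining input
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(static_cast<ValType>(in[pos++]));
    return true;
}

// FuntionMango: its members, in declaration order.
inline void serialize(Bytes& out, FuntionMango const& f) {
    serialize(out, f.ParamTypes);
    serialize(out, f.ReturnTypes);
}

inline bool deserialize(Bytes const& in, std::size_t& pos, FuntionMango& f) {
    return deserialize(in, pos, f.ParamTypes) &&
           deserialize(in, pos, f.ReturnTypes);
}

// vector<FuntionMango> (the Content member of MangoType): prefix + elements.
inline void serialize(Bytes& out, std::vector<FuntionMango> const& v) {
    put_len(out, v.size());
    for (auto const& f : v)
        serialize(out, f);
}

inline bool deserialize(Bytes const& in, std::size_t& pos,
                        std::vector<FuntionMango>& v) {
    std::size_t n = 0;
    if (!get_len(in, pos, n))
        return false;
    v.clear();
    for (std::size_t i = 0; i < n; ++i) {
        v.emplace_back();
        if (!deserialize(in, pos, v.back()))
            return false;
    }
    return true;
}
```

A Mango-level serialize would then call these in member order. For the caching use case in the question, the resulting Bytes could be written to a file once and read back on later runs instead of repeating the computation.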

Boost Use?

The only piece of Boost used is Boost Endian, which makes endianness conversions easy. This does not require linking to any Boost library (Boost Endian is header-only).

If you want to do without, you can opt to forget about endianness (assuming portability is not a requirement): Live On Coliru without boost.

Otherwise, you need to implement the endianness conversion yourself (e.g. with ntoh and hton).
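
A minimal sketch of doing this by hand, assuming big-endian byte order in the serialized data as in the Boost version above. The helper names put_be32 and get_be32 are made up for illustration. Shifting by byte position gives the same result on any host, so no ntoh/hton call is needed in this variant.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the code above): append a 32-bit
// unsigned value in big-endian byte order, independent of host endianness.
inline void put_be32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 24));
    out.push_back(static_cast<std::uint8_t>(v >> 16));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v));
}

// Hypothetical helper: read 4 big-endian bytes starting at pos and advance
// pos. Returns false (leaving pos untouched) if fewer than 4 bytes remain.
inline bool get_be32(std::vector<std::uint8_t> const& in, std::size_t& pos,
                     std::uint32_t& v) {
    if (pos > in.size() || in.size() - pos < 4)
        return false;
    v = (std::uint32_t{in[pos]} << 24) | (std::uint32_t{in[pos + 1]} << 16) |
        (std::uint32_t{in[pos + 2]} << 8) | std::uint32_t{in[pos + 3]};
    pos += 4;
    return true;
}
```

The 16-bit variants follow the same shape with two bytes instead of four.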

CodePudding user response:

As @Eljay says in a comment, the exact solution depends on the use case.

For me, if it is a one-off project, the most straightforward "binary dump" method would be to reconsider your basic data types and store everything compactly, using fixed-size structures.

struct FuntionMango
{
    int NumParams; // valid items in Param/Return arrays
    int NumReturns;

    ValType ParamTypes[MAX_PARAMS];
    ValType ReturnTypes[MAX_RETURNS];
};

struct MangoType
{
    int NumContent; // valid items in Content array
    // Fixed array instead of vector<FuntionMango>
    FuntionMango Content[MAX_FUNCTIONS];
};

struct Mango // all fields are just 'public'
{
    MangoType typeMan;
};

Then the "save" procedure would be

void saveMango(const char* filename, Mango* mango)
{
    FILE* OutFile = fopen(...);
    fwrite(mango, 1, sizeof(Mango), OutFile);
    fclose(OutFile);
}

and load just uses "fread" (of course, all error handling and file integrity checking is omitted)

void loadMango(const char* filename, Mango* mango)
{
    FILE* InFile = fopen(...);
    fread(mango, 1, sizeof(Mango), InFile);
    fclose(InFile);
}

To convert your Mango into a byte array, just use a reinterpret_cast or a C-style cast.

Unfortunately, this approach fails if any of your structures contains pointer fields or has non-trivial constructors.
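
Part of this restriction can be checked at compile time. The sketch below uses hypothetical helpers to_bytes and from_bytes and an invented FixedRecord struct, none of which come from the answer. The static_assert rejects types with members such as std::vector or std::string. Note that raw pointer members still pass the check, so those have to be avoided by hand. The bytes also reflect the host's padding and endianness, so the dump is only meant to be read back by the same build on the same platform.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Copy the object representation of a trivially copyable T into a byte vector.
template <typename T>
std::vector<std::uint8_t> to_bytes(T const& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte dumps are only valid for trivially copyable types");
    std::vector<std::uint8_t> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

// Restore a T from bytes produced by to_bytes; fails on a size mismatch.
template <typename T>
bool from_bytes(std::vector<std::uint8_t> const& bytes, T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte dumps are only valid for trivially copyable types");
    if (bytes.size() != sizeof(T))
        return false;
    std::memcpy(&value, bytes.data(), sizeof(T));
    return true;
}

// Invented example of a fixed-size record in the style suggested above.
struct FixedRecord {
    std::int32_t count;    // number of valid entries in items
    std::uint8_t items[8]; // fixed array instead of a vector
};
```

Calling to_bytes on a struct that holds a std::vector would fail to compile, which is the point of the guard.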
