How to initialize a struct with random data in C?

Time:08-11

Assume we have the following complex struct (in C). It includes fixed-length arrays, members of varying sizes, enums, and other structs. The struct is not packed and cannot be packed.

typedef struct {
    uint8_t smallNum;
    /* uint8_t align0 */
    /* uint8_t align1 */
    /* uint8_t align2 */
    uint32_t arrayOfBigNums[5];
    bool isFalse;
    /* uint8_t align0 */
    /* uint8_t align1 */
    /* uint8_t align2 */
    struct myOtherStruct;
    /* uint8_t align0 */
    enum mySmallEnum;
...
    uint8_t aByte;
    /* uint8_t align0 */
    /* uint8_t align1 */
} myStruct;

We would like to initialize this struct with random data for unit testing. The test includes taking this struct, serializing it, writing it to flash, reading it, deserializing it, and then checking that the data hasn't changed.

We could either generate all possible values, or generate a small randomized subset and test that. How would one go about doing this?

The immediate solution would be the following:

myStruct testData = {0};
myStruct readData = {0};
testData.smallNum = rand();
testData.arrayOfBigNums[0] = rand();
testData.arrayOfBigNums[1] = rand();
....
testData.aByte = rand();
save(&testData);
load(&readData);
int ret = memcmp(&readData, &testData, sizeof(myStruct)); 
// ret == 0. Good!

But this solution isn't scalable. Adding another field to the struct, or increasing the array would require changing the unit test. Also, it'd be a long init function, prone to human error.

Another solution would be to generate a random array of bytes and then memcpy it into our struct. Although this is a great idea, in practice it won't work because our struct isn't packed: the padding bytes of the deserialized struct are and always will be 0, while the random copy has junk in them.

myStruct testData = {0};
myStruct readData = {0};
uint8_t* randomData = genRandomData(sizeof(myStruct));
memcpy((uint8_t*)&testData, randomData, sizeof(myStruct));
save(&testData);
load(&readData);
int ret = memcmp(&readData, &testData, sizeof(myStruct)); 
// ret != 0 because deserialization will fill the struct properly. 
// Ignoring junk bytes in padding
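
To see why the byte-wise comparison fails, it can help to measure how many bytes of a struct belong to no member at all. The following is only a sketch: `struct Sample` and `sample_padding_bytes` are hypothetical stand-ins (the full struct is not shown in the question), and the result depends on the compiler and target ABI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct in the question. */
struct Sample {
    uint8_t  smallNum;
    uint32_t arrayOfBigNums[5];
    bool     isFalse;
    uint8_t  aByte;
};

/* Number of bytes in struct Sample that belong to no member (padding).
 * Computed as the total size minus the sum of the member sizes. */
static size_t sample_padding_bytes(void)
{
    size_t members = sizeof(uint8_t)        /* smallNum */
                   + sizeof(uint32_t[5])    /* arrayOfBigNums */
                   + sizeof(bool)           /* isFalse */
                   + sizeof(uint8_t);       /* aByte */
    return sizeof(struct Sample) - members;
}
```

On common targets where uint32_t is 4-byte aligned the result is non-zero, and every one of those bytes can differ between a memcpy-filled struct and a deserialized one.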

My current working solution is to iteratively improve on the previous one. Initialize a mask struct with every member set to all ones (0xff bytes), treat both structs as byte arrays, and AND the random data with the mask. You get a compiler warning, but to my knowledge it's harmless.

myStruct structMask = {MACRO_TO_FILL_WITH_MANY(-1)};
myStruct testData = {0};
myStruct readData = {0};
uint8_t* randomData = genRandomData(sizeof(myStruct));
memcpy((uint8_t*)&testData, randomData, sizeof(myStruct));
for (size_t i = 0; i < sizeof(myStruct); i++) {
    ((uint8_t*)&testData)[i] &= ((uint8_t*)&structMask)[i];
}
save(&testData);
load(&readData);
int ret = memcmp(&readData, &testData, sizeof(myStruct)); 
// ret == 0 as long as the padding bytes of structMask are 0,
// because the junk bytes in padding are masked out.

This solution is also not optimal, because it doesn't take into account enum limitations. Types like bool (from stdbool.h) are stored as 0/1 which is nice, but enums can still have undefined values. So we'd need to manually modulo all enum values.
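
The "manually modulo" step could look roughly like the sketch below. It assumes a hypothetical enum whose values are contiguous from 0 and which ends in a sentinel `MY_SMALL_ENUM_COUNT`; the real `mySmallEnum` is not shown in the question, and `wrap_enum` is a made-up helper name.

```c
#include <stdint.h>

/* Hypothetical enum; the question does not show the real definition. */
enum mySmallEnum {
    ENUM_A,
    ENUM_B,
    ENUM_C,
    MY_SMALL_ENUM_COUNT /* sentinel: number of valid values */
};

/* Map an arbitrary raw value onto a valid enumerator.
 * Only works for enums with contiguous values starting at 0. */
static enum mySmallEnum wrap_enum(uint32_t raw)
{
    return (enum mySmallEnum)(raw % MY_SMALL_ENUM_COUNT);
}
```

This still has to be repeated for every enum member, which is exactly the scalability problem.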

Looking for more generic / robust solutions.

CodePudding user response:

Considering you want the random data to obey the rules of the types, there is no way to make that work in a scalable manner.

What I would do is to just copy random bytes into the struct. Yes, some data is going to be invalid, and that too should be a test case to see how you handle corrupted/invalid data:

#include <stdio.h>
#include <string.h>

struct Foo {
    int a;
    double d;
    char ch;
    _Bool b;
};

int main() {
    FILE* random = fopen("/dev/urandom", "rb");
    struct Foo foo;

    if (!random || fread(&foo, sizeof(foo), 1, random) != 1) {
        fputs("Failed to generate random data", stderr);
        return -1;
    }
    
    fclose(random);

    printf("foo.d = %f\n", foo.d); /* %f, not %d: foo.d is a double */
}

CodePudding user response:

C does not have reflection. In C, you would write a normal initialization function that initializes every member of the struct manually.

I would just write a macro CHOOSE_RANDOM that chooses a random value from a set of values, initialize the struct with random data using memset, and then limit the members that have constraints.
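
A minimal sketch of such a macro, assuming C99 compound literals and variadic macros are available. The answer does not show its own definition of CHOOSE_RANDOM, so this is one possible shape, and `enum color` and `random_color` are hypothetical names used only for illustration.

```c
#include <stdlib.h>

/* One possible CHOOSE_RANDOM: build a temporary int array from the
 * arguments (a C99 compound literal) and index it with rand().
 * The slight modulo bias should not matter for test data. */
#define CHOOSE_RANDOM(...) \
    (((const int[]){__VA_ARGS__})[ \
        (size_t)rand() % (sizeof((const int[]){__VA_ARGS__}) / sizeof(int))])

/* Hypothetical enum with non-contiguous values. */
enum color { RED = 1, GREEN = 5, BLUE = 9 };

/* Pick one of the valid enumerators at random. */
static enum color random_color(void)
{
    return (enum color)CHOOSE_RANDOM(RED, GREEN, BLUE);
}
```

Unlike a plain modulo, listing the enumerators also works for enums whose values are not contiguous.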

"But this solution isn't scalable."

"Looking for more generic / robust solutions."

It is said that AI is so smart it will replace developers in the future. Write a program that inspects your C source code and generates the initialization function. Such a program, because it would be outside C, will be able to inspect the source code and make type aware decisions.

The following Python code:

#!/usr/bin/env python3

from pycparser import parse_file, c_ast
import tempfile
import os

text = r"""
#include <stdbool.h>
#include <stdint.h>

enum mySmallEnum { A, B, C };

struct {
    uint8_t aByte;
    bool isFalse;
    enum mySmallEnum myenum;
} myStruct;


"""

structname = "myStruct"

with tempfile.NamedTemporaryFile('w') as tmp:
    tmp.write(text)
    tmp.flush()
    ast = parse_file(tmp.name, use_cpp=True)
    for a in ast:
        if isinstance(a, c_ast.Decl) and a.name == structname:
            members = a.type.type.decls
            print("// Initialize {}".format(structname))
            for d in members:
                name = d.type.declname
                type = d.type.type
                if isinstance(type, c_ast.IdentifierType):
                    type = type.names[0]
                    if type == "uint8_t":
                        print("{}.{} = rand();".format(structname, name))
                    elif type == "_Bool":
                        print("{}.{} = rand() ? 1 : 0;".format(structname, name))
                    else:
                        assert 0, "TODO"
                elif isinstance(type, c_ast.Enum):
                    typename = type.name
                    vals = []
                    for a in ast:
                        if isinstance(a, c_ast.Decl) and isinstance(a.type, c_ast.Enum) and a.type.name == typename:
                            vals += [x.name for x in a.type.values.enumerators]
                            break
                    print("{}.{} = CHOOSE_RANDOM({});".format(structname, name, ', '.join(vals)))
                else:
                    assert 0, "TODO"

outputs:

// Initialize myStruct
myStruct.aByte = rand();
myStruct.isFalse = rand() ? 1 : 0;
myStruct.myenum = CHOOSE_RANDOM(A, B, C);

CodePudding user response:

Okay, I actually found the answer to my own question. The main problem is that there isn't an initializer for my struct. However, there already exist serialization functions. So we can use the non-working second solution, and convert it to a working solution.

So a much cleaner solution would be:

myStruct testData = {0};
myStruct readData = {0};
uint8_t* randomData = genRandomData(sizeof(myStruct));
memcpy(&testData, randomData, sizeof(myStruct));
save(&testData);
load(&readData); // readData is now a valid random struct.

myStruct realReadData = {0};
save(&readData);
load(&realReadData);

memcmp(&realReadData, &readData, sizeof(myStruct)); // This does equal 0 if serialize/deserialize functions are correct.

If there's an issue in serialize/deserialize functions, the input struct (corrupt as it may be) should provide different results.
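
The double round trip can be sketched end to end as follows. Everything here is an assumption made for illustration: `Record`, `enum mode`, the 6-byte wire format, and the in-memory buffer that stands in for flash are hypothetical, and `save`/`load` are toy versions of the real serialization functions. The comparison is done member by member rather than with memcmp, so the sketch does not depend on what the padding bytes contain.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct and wire format, not the ones from the question. */
enum mode { MODE_OFF, MODE_ON, MODE_AUTO, MODE_COUNT };

typedef struct {
    uint8_t   smallNum;
    uint32_t  bigNum;
    enum mode mode;
} Record;

#define WIRE_SIZE 6 /* 1 + 4 + 1 bytes, no padding on the wire */

/* Serialize member by member; padding never reaches the buffer. */
static void save(const Record *r, uint8_t out[WIRE_SIZE])
{
    out[0] = r->smallNum;
    out[1] = (uint8_t)(r->bigNum);
    out[2] = (uint8_t)(r->bigNum >> 8);
    out[3] = (uint8_t)(r->bigNum >> 16);
    out[4] = (uint8_t)(r->bigNum >> 24);
    out[5] = (uint8_t)r->mode;
}

/* Deserialize and normalize: the enum is forced into its valid range. */
static void load(Record *r, const uint8_t in[WIRE_SIZE])
{
    r->smallNum = in[0];
    r->bigNum = (uint32_t)in[1] | ((uint32_t)in[2] << 8)
              | ((uint32_t)in[3] << 16) | ((uint32_t)in[4] << 24);
    r->mode = (enum mode)(in[5] % MODE_COUNT);
}

/* Member-wise comparison, independent of padding contents. */
static bool records_equal(const Record *a, const Record *b)
{
    return a->smallNum == b->smallNum
        && a->bigNum == b->bigNum
        && a->mode == b->mode;
}

/* Fill a struct with junk, round-trip it once to obtain a valid random
 * struct, round-trip that again, and compare the two valid structs. */
static bool round_trip_is_stable(unsigned seed)
{
    uint8_t junk[sizeof(Record)];
    uint8_t wire[WIRE_SIZE];
    Record testData, readData, realReadData;

    srand(seed);
    for (size_t i = 0; i < sizeof junk; i++)
        junk[i] = (uint8_t)rand();
    memcpy(&testData, junk, sizeof testData); /* possibly invalid struct */

    save(&testData, wire);
    load(&readData, wire);      /* readData is now a valid random struct */

    save(&readData, wire);
    load(&realReadData, wire);

    return readData.mode < MODE_COUNT
        && records_equal(&readData, &realReadData);
}
```

Calling `round_trip_is_stable` with a range of seeds gives a small randomized test set without writing a per-member initializer.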
