Allocating dynamic array of structs with dynamic arrays in C

I am trying to allocate an array of structs, with each struct also containing dynamic arrays. They will later be communicated via MPI_Sendrecv:

struct cell {
    double a, b, c, *aa, *bb;
} *Send_l, *Send_r;

I want Send_l and Send_r to have count elements each, and the arrays aa and bb should each contain sAS elements. This is all done after MPI_Init.

void allocateForSendRecv(int count) {
    int sAS = 5;
    int iter = 0;

    Send_l = (struct cell *)malloc(count * (sizeof(struct cell)));
    for (iter = 0; iter < count; iter++) {
        Send_l[iter].aa = (double *)malloc((sAS - 1) * sizeof(double));
        Send_l[iter].bb = (double *)malloc((sAS - 1) * sizeof(double));
    }
    //sAS-1, as sizeof(struct cell) already contains a single (double) for aa and bb.

    Send_r = (struct cell *)malloc(count * (sizeof(struct cell)));
    for (iter = 0; iter < count; iter++) {
        Send_r[iter].aa = (double *)malloc((sAS - 1) * sizeof(double));
        Send_r[iter].bb = (double *)malloc((sAS - 1) * sizeof(double));
    }
}

With this, I can freely allocate, fill and deallocate. However, when I call the following, my results diverge from my reference (which uses only stack arrays).

MPI_Sendrecv(&(Send_r[0]), count, ..., &(Send_l[0]), count, ...)

I haven't found the exact reason, but posts about similar issues made me assume it's due to my non-contiguous memory allocation. I've tried to solve the problem by using a single malloc call, only to get a segmentation fault when I fill my arrays aa and bb:

    Send_l = malloc(count * (sizeof(*Send_l)) + count * (sizeof(*Send_l) + 2 * (sAS - 1) * sizeof(double)));

    Send_r = malloc(count * (sizeof(*Send_r)) + count * (sizeof(*Send_r) + 2 * (sAS - 1) * sizeof(double)));

I have reused some code to allocate 2D arrays and applied it to this struct problem, but haven't been able to make it work. Am I right in assuming that, with a functioning single malloc call and therefore contiguous memory allocation, my MPI_Sendrecv would work fine? Alternatively, would using MPI_Type_create_struct solve my non-contiguous memory problem?

Minimal example (without MPI) of segmentation fault. Using allocateSendRecv, everything is fine. But the single alloc in allocateInOneSendRecv gives me issues.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

struct cell {
    double a, b, c, *aa, *bb;
} *Send_l, *Send_r;

void allocateSendRecv(int count, int sAS);
void fillSendRecv(int count, int sAS);
void freeSendRecv(int count);
void printSendRecv(int count, int sAS);
void allocateInOneSendRecv(int count, int sAS);

int main(int argc, char *argv[])
{
    const int count = 2;
    const int sAS = 9;
    allocateSendRecv(count, sAS);
    //allocateInOneSendRecv(count, sAS);
    fillSendRecv(count, sAS);
    printSendRecv(count, sAS);
    freeSendRecv(count);
    return 0;
}

void allocateSendRecv(int count, int sAS) {
    int iter = 0;

    printf("Allocating!\n");

    Send_r = (struct cell *)malloc(count * (sizeof(struct cell)));
    for (iter = 0; iter < count; iter++) {
        Send_r[iter].aa = (double *)malloc((sAS - 1) * sizeof(double));
        Send_r[iter].bb = (double *)malloc((sAS - 1) * sizeof(double));
    }

    Send_l = (struct cell *)malloc(count * (sizeof(struct cell)));
    for (iter = 0; iter < count; iter++) {
        Send_l[iter].aa = (double *)malloc((sAS - 1) * sizeof(double));
        Send_l[iter].bb = (double *)malloc((sAS - 1) * sizeof(double));
    }
}

void allocateInOneSendRecv(int count, int sAS) {
    printf("Allocating!\n");

    Send_l = malloc(count * (sizeof(*Send_l)) + count * (sizeof(*Send_l) + 2 * (sAS - 1) * sizeof(double)));

    Send_r = malloc(count * (sizeof(*Send_r)) + count * (sizeof(*Send_r) + 2 * (sAS - 1) * sizeof(double)));
}

void freeSendRecv(int count) {
    int iter = 0;

    printf("Deallocating!\n");

    free(Send_r);

    free(Send_l);
}

void fillSendRecv(int count, int sAS) {
    int iter = 0;
    int iter2= 0;
    double dummyDouble = 5.0;

    printf("Filling!\n");

    for (iter = 0; iter < count; iter++) {
        Send_l[iter].a = dummyDouble;
        Send_l[iter].b = dummyDouble;
        Send_l[iter].c = dummyDouble;
        for (iter2 = 0; iter2 < sAS; iter2++) {
            Send_l[iter].aa[iter2] = dummyDouble;
            Send_l[iter].bb[iter2] = dummyDouble;
        }

        dummyDouble++;

        Send_r[iter].a = dummyDouble;
        Send_r[iter].b = dummyDouble;
        Send_r[iter].c = dummyDouble;
        for (iter2 = 0; iter2 < sAS; iter2++) {
            Send_r[iter].aa[iter2] = dummyDouble;
            Send_r[iter].bb[iter2] = dummyDouble;
        }
        dummyDouble++;
    }
}

void printSendRecv(int count, int sAS) {
    int iter = 0;

    printf("Printing!\n");

    for (iter = 0; iter < count; iter++) {
        printf("%f \n", Send_l[iter].a);
        printf("%f \n", Send_l[iter].b);
        printf("%f \n", Send_l[iter].c);
        printf("%f \n", Send_l[iter].aa[sAS - 1]);
        printf("%f \n\n", Send_l[iter].bb[sAS - 1]);

        printf("%f \n", Send_r[iter].a);
        printf("%f \n", Send_r[iter].b);
        printf("%f \n", Send_r[iter].c);
        printf("%f \n", Send_r[iter].aa[sAS - 1]);
        printf("%f \n\n", Send_r[iter].bb[sAS - 1]);
    }
}

CodePudding user response：

Your current problem is that you can only pass the start address of Send_l (resp. Send_r) to MPI_Sendrecv. From that address on, all the memory has to be contiguous, and you must know its total size so that you can give it to MPI_Sendrecv later.

After allocation, you must also ensure that the aa and bb members are correctly initialized to point inside the allocated block of memory.

A possible implementation could be:

void allocateSendRecv(int count, int subCount) {
    int iter;

    // total size of each struct
    size_t sz = sizeof(struct cell) + 2 * subCount * sizeof(double);

    // one single contiguous allocation
    Send_r = malloc(count * sz); // nota: never cast malloc in C language!

    // per each cell make aa and bb point into the allocated memory
    for (iter = 0; iter < count; iter++) {
        Send_r[iter].aa = ((double*)(Send_r + count)) + 2 * subCount * iter;
        Send_r[iter].bb = Send_r[iter].aa + subCount;
    }

    // id. for Send_l
    Send_l = malloc(count * sz);
    for (iter = 0; iter < count; iter++) {
        Send_l[iter].aa = ((double*)(Send_l + count)) + 2 * subCount * iter;
        Send_l[iter].bb = Send_l[iter].aa + subCount;
    }
}

Here the block holds the array of cell structures first, followed by one aa array and one bb array per structure, in that order.
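To make that layout easier to check, here is a minimal sketch for a single array of cells. The names alloc_cells and cells_are_contiguous are hypothetical helpers introduced only for this illustration, and the sketch assumes sizeof(struct cell) is a multiple of sizeof(double), so that the doubles placed after the structs are suitably aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct cell {
    double a, b, c, *aa, *bb;
};

/* Hypothetical helper: one malloc holds `count` structs followed by
   2 * count * subCount doubles. Returns NULL on allocation failure. */
struct cell *alloc_cells(size_t count, size_t subCount)
{
    /* Assumption: the doubles that follow the structs stay aligned. */
    assert(sizeof(struct cell) % sizeof(double) == 0);

    size_t bytes = count * (sizeof(struct cell) + 2 * subCount * sizeof(double));
    struct cell *cells = malloc(bytes);
    if (cells == NULL)
        return NULL;

    /* The doubles start right after the last struct. */
    double *payload = (double *)(cells + count);
    for (size_t i = 0; i < count; i++) {
        cells[i].aa = payload + 2 * subCount * i;
        cells[i].bb = cells[i].aa + subCount;
    }
    return cells;
}

/* Returns 1 if every aa/bb array lies inside the block owned by `cells`
   and each bb directly follows its aa, 0 otherwise. */
int cells_are_contiguous(const struct cell *cells, size_t count, size_t subCount)
{
    const double *begin = (const double *)(cells + count);
    const double *end = begin + 2 * count * subCount;
    for (size_t i = 0; i < count; i++) {
        if (cells[i].aa < begin || cells[i].bb + subCount > end)
            return 0;
        if (cells[i].bb != cells[i].aa + subCount)
            return 0;
    }
    return 1;
}
```

With this layout a single free(cells) releases everything, because aa and bb do not own separate allocations.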

That is enough to get rid of the segmentation fault...
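One caveat worth keeping in mind: aa and bb hold addresses, and an address is only meaningful inside the process that produced it. If the raw block is sent to another rank, the receiver gets the sender's pointer values and would have to recompute aa and bb before using them. A hedged alternative is to copy the values into a flat buffer of doubles and send that instead. The sketch below shows only the packing and unpacking; packed_len, pack_cells and unpack_cells are hypothetical names, and the MPI call itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct cell {
    double a, b, c, *aa, *bb;
};

/* Number of doubles needed to describe one cell by value:
   a, b, c plus subCount entries each for aa and bb. */
size_t packed_len(size_t subCount)
{
    return 3 + 2 * subCount;
}

/* Copy the values (not the addresses) of `count` cells into `out`,
   which must have room for count * packed_len(subCount) doubles. */
void pack_cells(const struct cell *cells, size_t count, size_t subCount,
                double *out)
{
    assert(cells != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        double *p = out + i * packed_len(subCount);
        p[0] = cells[i].a;
        p[1] = cells[i].b;
        p[2] = cells[i].c;
        memcpy(p + 3, cells[i].aa, subCount * sizeof(double));
        memcpy(p + 3 + subCount, cells[i].bb, subCount * sizeof(double));
    }
}

/* Inverse of pack_cells. Each cells[i].aa and cells[i].bb must already
   point at local storage for subCount doubles; the pointers themselves
   are left untouched. */
void unpack_cells(const double *in, size_t count, size_t subCount,
                  struct cell *cells)
{
    assert(in != NULL && cells != NULL);
    for (size_t i = 0; i < count; i++) {
        const double *p = in + i * packed_len(subCount);
        cells[i].a = p[0];
        cells[i].b = p[1];
        cells[i].c = p[2];
        memcpy(cells[i].aa, p + 3, subCount * sizeof(double));
        memcpy(cells[i].bb, p + 3 + subCount, subCount * sizeof(double));
    }
}
```

Under these assumptions, the flat buffer could be passed to MPI_Sendrecv as count * packed_len(subCount) elements of MPI_DOUBLE, with the receiver calling unpack_cells into cells it allocated itself.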

CodePudding user response：

The single global struct

struct cell
{
    double a, b, c, *aa, *bb;
} * Send_l, *Send_r;

is a bit fragile:

aa and bb are allocated as arrays of double, but the subCount - 1 size is not stored anywhere. It is buried in the code.
Send_l and Send_r are also pointers to arrays of struct cell, but the count size is not stored either. It is also buried in the code. The single struct is global, which is also weak.

This makes it hard to test, allocate or free data. I will leave a C example that uses a bit of encapsulation and that you can adapt to your case under MPI. I will use your code and functions with a bit of OOP orientation :)

A `Cell` structure

typedef struct
{
    double  a;
    double  b;
    double  c;
    double* aa; 
    double* bb;

} Cell;

The `Send` structure

typedef struct
{
    Cell     l;
    Cell     r;

} Send;

The `Set` structure


typedef struct
{
    unsigned count;
    unsigned subCount;
    Send*    send;

} Set;

So a Set has all that is needed to describe its contents.

function prototypes

Set* allocateSendRecv(unsigned, unsigned);
int  fillSendRecv(Set*);
Set* freeSendRecv(Set*);

Using encapsulation and a bit of RAII from C++ you can rewrite allocateSendRecv() and freeSendRecv() as constructor and destructor of the struct:

Set* allocateSendRecv(unsigned count, unsigned subCount)
{
    // count is the number of send buffers
    // subcount is the size of the arrays inside each cell
    printf(
        "Allocating(count = %u, subCount = %u)\n", count,
        subCount);
    Set* nw      = (Set*)malloc(sizeof(Set));
    nw->count    = count;
    nw->subCount = subCount;
    nw->send     = (Send*)malloc(count * sizeof(Send));
    // now that we have Send allocate the Cell arrays
    for (unsigned i = 0; i < count; i++)
    {
        nw->send[i].l.aa =
            (double*)malloc(subCount * sizeof(double));
        nw->send[i].l.bb =
            (double*)malloc(subCount * sizeof(double));
        nw->send[i].r.aa =
            (double*)malloc(subCount * sizeof(double));
        nw->send[i].r.bb =
            (double*)malloc(subCount * sizeof(double));
    }
    return nw;
}

Set* freeSendRecv(Set* set)
{
    if (set == NULL) return NULL;
    printf(
        "Deallocating(count = %u, subCount = %u)\n",
        set->count, set->subCount);

    for (unsigned i = 0; i < set->count; i++)
    {
        // free all four arrays allocated per Send
        free(set->send[i].l.aa);
        free(set->send[i].l.bb);
        free(set->send[i].r.aa);
        free(set->send[i].r.bb);
    }
    free(set->send);
    free(set);
    return NULL;
}

`main()` for the example

int main(void)
{
    Set* tst = allocateSendRecv(2, 5);
    fillSendRecv(tst);
    printSendRecv(tst, "printSendRecv():    ");
    tst = freeSendRecv(tst);
    return 0;
}

Written this way, the tst pointer is invalidated in the call to freeSendRecv(). In this case tst is allocated with count and subCount as 2 and 5, and these values go inside the Set. fillSendRecv() uses incremental fill values to make it easy to pinpoint any eventual displacement. printSendRecv() accepts a string for an optional message. Values are printed while the Set is filled and again by printSendRecv().
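The "destructor returns NULL" idiom can be shown in isolation. This is only a sketch on a hypothetical Buffer type that stands in for Set; buffer_create and buffer_destroy are names made up for the illustration.

```c
#include <assert.h> /* assert() is only needed for the checks in the note below */
#include <stdlib.h>

/* Hypothetical owner type, standing in for Set: it carries its own size. */
typedef struct
{
    unsigned n;
    double*  data;
} Buffer;

/* "Constructor": returns NULL if any allocation fails. */
Buffer* buffer_create(unsigned n)
{
    Buffer* b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->n    = n;
    b->data = malloc(n * sizeof *b->data);
    if (b->data == NULL && n != 0)
    {
        free(b);
        return NULL;
    }
    return b;
}

/* "Destructor": frees everything and always returns NULL, so the
   caller can overwrite its pointer in the same statement. */
Buffer* buffer_destroy(Buffer* b)
{
    if (b == NULL) return NULL;
    free(b->data);
    free(b);
    return NULL;
}
```

A caller would write b = buffer_destroy(b); after which assert(b == NULL) holds and an accidental second call to buffer_destroy(b) is harmless.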

output of the example

Allocating(count = 2, subCount = 5)
Filling!
        Filling set 1 of 2
        l:      [a,b,c] = [    42.001,    42.002,    42.003]
        aa:     42.004     42.005     42.006     42.007     42.008
        bb:     42.009     42.010     42.011     42.012     42.013

        r:      [a,b,c] = [    42.014,    42.015,    42.016]
        aa:     42.017     42.018     42.019     42.020     42.021
        bb:     42.022     42.023     42.024     42.025     42.026

        Filling set 2 of 2
        l:      [a,b,c] = [    42.027,    42.028,    42.029]
        aa:     42.030     42.031     42.032     42.033     42.034
        bb:     42.035     42.036     42.037     42.038     42.039

        r:      [a,b,c] = [    42.040,    42.041,    42.042]
        aa:     42.043     42.044     42.045     42.046     42.047
        bb:     42.048     42.049     42.050     42.051     42.052

printSendRecv():        Count is 2, subCount is 5
        Set 1 of 2
        l:
        [a,b,c] = [    42.001,    42.002,    42.003]
        aa:     42.004     42.005     42.006     42.007     42.008
        bb:     42.009     42.010     42.011     42.012     42.013

        r:
        [a,b,c] = [    42.014,    42.015,    42.016]
        aa:     42.017     42.018     42.019     42.020     42.021
        bb:     42.022     42.023     42.024     42.025     42.026


        Set 2 of 2
        l:
        [a,b,c] = [    42.027,    42.028,    42.029]
        aa:     42.030     42.031     42.032     42.033     42.034
        bb:     42.035     42.036     42.037     42.038     42.039

        r:
        [a,b,c] = [    42.040,    42.041,    42.042]
        aa:     42.043     42.044     42.045     42.046     42.047
        bb:     42.048     42.049     42.050     42.051     42.052



Deallocating(count = 2, subCount = 5)

The increment is 0.001.

The example in 2 files

a header `v1.h`

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct
{
    double  a;
    double  b;
    double  c;
    double* aa;
    double* bb;

} Cell;

typedef struct
{
    Cell     l;
    Cell     r;

} Send;

typedef struct
{
    unsigned count;
    unsigned subCount;
    Send*    send;

} Set;

main file `v1.c`

#include "v1.h"

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

Set*   allocateSendRecv(unsigned, unsigned);
int    fillSendRecv(Set*);
Set*   freeSendRecv(Set*);
double getNext(void);
int    printCell(Cell*, unsigned, const char*);
int    printSendRecv(Set*, const char*);

int main(void)
{
    Set* tst = allocateSendRecv(2, 5);
    fillSendRecv(tst);
    printSendRecv(tst, "printSendRecv():    ");
    tst = freeSendRecv(tst);
    return 0;
}

Set* allocateSendRecv(unsigned count, unsigned subCount)
{
    // count is the number of send buffers
    // subcount is the size of the arrays inside each cell
    printf(
        "Allocating(count = %u, subCount = %u)\n", count,
        subCount);
    Set* nw      = (Set*)malloc(sizeof(Set));
    nw->count    = count;
    nw->subCount = subCount;
    nw->send     = (Send*)malloc(count * sizeof(Send));
    // now that we have Send allocate the Cell arrays
    for (unsigned i = 0; i < count; i++)
    {
        nw->send[i].l.aa =
            (double*)malloc(subCount * sizeof(double));
        nw->send[i].l.bb =
            (double*)malloc(subCount * sizeof(double));
        nw->send[i].r.aa =
            (double*)malloc(subCount * sizeof(double));
        nw->send[i].r.bb =
            (double*)malloc(subCount * sizeof(double));
    }
    return nw;
}

Set* freeSendRecv(Set* set)
{
    if (set == NULL) return NULL;
    printf(
        "Deallocating(count = %u, subCount = %u)\n",
        set->count, set->subCount);

    for (unsigned i = 0; i < set->count; i++)
    {
        // free all four arrays allocated per Send
        free(set->send[i].l.aa);
        free(set->send[i].l.bb);
        free(set->send[i].r.aa);
        free(set->send[i].r.bb);
    }
    free(set->send);
    free(set);
    return NULL;
}

int fillSendRecv(Set* s)
{
    printf("Filling!\n");
    if (s == NULL) return -1;

    for (unsigned i = 0; i < s->count; i += 1)
    {
        printf("\tFilling set %u of %u\n", 1 + i, s->count);
        // l
        s->send[i].l.a = getNext();
        s->send[i].l.b = getNext();
        s->send[i].l.c = getNext();
        for (unsigned j = 0; j < s->subCount; j += 1)
            s->send[i].l.aa[j] = getNext();
        for (unsigned j = 0; j < s->subCount; j += 1)
            s->send[i].l.bb[j] = getNext();
        printCell(&s->send[i].l, s->subCount, "\tl:");

        // r
        s->send[i].r.a = getNext();
        s->send[i].r.b = getNext();
        s->send[i].r.c = getNext();
        for (unsigned j = 0; j < s->subCount; j += 1)
            s->send[i].r.aa[j] = getNext();
        for (unsigned j = 0; j < s->subCount; j += 1)
            s->send[i].r.bb[j] = getNext();
        printCell(&s->send[i].r, s->subCount, "\tr:");
    }
    return 0;
}

double getNext(void)
{
    static double ix = 42.;
    ix += .001;
    return ix;
}

int printCell(Cell* cell, unsigned sz, const char* msg)
{
    printf(
        "%s\t[a,b,c] = [%10.3f,%10.3f,%10.3f]\n", msg,
        cell->a, cell->b, cell->c);
    printf("\taa: ");
    for (unsigned j = 0; j < sz; j += 1)
        printf("%10.3f ", cell->aa[j]);
    printf("\n\tbb: ");
    for (unsigned j = 0; j < sz; j += 1)
        printf("%10.3f ", cell->bb[j]);
    printf("\n\n");
    return 0;
}

int printSendRecv(Set* s, const char* msg)
{
    if (s == NULL) return -1;
    if (msg != NULL) printf("%s", msg);

    printf(
        "    Count is %u, subCount is %u\n", s->count,
        s->subCount);
    for (unsigned i = 0; i < s->count; i += 1)
    {
        printf("\tSet %u of %u\n", 1 + i, s->count);
        printCell(&s->send[i].l, s->subCount, "\tl:\n");
        printCell(&s->send[i].r, s->subCount, "\tr:\n");
        printf("\n");
    }
    printf("\n");
    return 0;
}

casting the return for `malloc()`

Yes, I always cast the return of malloc(), as I and many others do not like anything implicit. Also, malloc() accepts any expression that evaluates to a size, and looking at the expression does not always say much about the area being allocated. Programs often allocate data for many structures, some enclosed in others; this little program has 3. So the cast works as a reminder to the programmers of what the program intends to allocate, and it can avoid many bugs, since the expression alone is often not sufficient to show what is what.
This thing about malloc() and casts comes from the C-FAQ, an old, never-updated compilation of Usenet articles, all dating from before 2000. Even at that time, people wrote there about possible reasons to cast the pointer. One argument discussed in the C-FAQ involves forgetting to include stdlib.h; there it is raised against the cast, since the cast can hide the warning that would otherwise reveal the missing include. Funny. I mean it:

Suppose that you call malloc but forget to #include <stdlib.h>.
 The compiler is likely to assume that malloc is a function
 returning int, which is of course incorrect, and will lead to trouble

Therefore, the seemingly redundant casts are used by people who are
(a) concerned with portability to all pre-ANSI compilers, or
(b) of the opinion that implicit conversions are a bad thing.

I would add the reason I described above.
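For comparison, here are the two styles side by side. Both are valid C and allocate the same amount of memory; which one reads better is the matter of opinion discussed above. The function names are hypothetical and exist only for this sketch.

```c
#include <assert.h> /* assert() is only needed if you check the results */
#include <stdlib.h>

typedef struct
{
    double  a, b, c;
    double* aa;
    double* bb;
} Cell;

/* Style 1: explicit cast, which names the type at the call site. */
Cell* alloc_cells_cast(size_t n)
{
    return (Cell*)malloc(n * sizeof(Cell));
}

/* Style 2: no cast; sizeof *p keeps the size tied to the pointer's
   type, so changing the type of p updates the size automatically. */
Cell* alloc_cells_nocast(size_t n)
{
    Cell* p = malloc(n * sizeof *p);
    return p;
}
```

In C the conversion from void* is implicit, so the cast changes nothing at run time; in C++ the cast is required.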
