I am trying to allocate an array of structs, with each struct also containing dynamic arrays. They will later be communicated via MPI_Sendrecv:
struct cell {
double a, b, c, *aa, *bb;
} *Send_l, *Send_r;
I want Send_l and Send_r to have count elements each; the arrays aa and bb should contain sAS elements each. This is all done after MPI_Init.
void allocateForSendRecv(int count) {
int sAS = 5;
int iter = 0;
Send_l = (struct cell *)malloc(count * (sizeof(struct cell)));
for (iter = 0; iter < count; iter++) {
Send_l[iter].aa = (double *)malloc((sAS - 1) * sizeof(double));
Send_l[iter].bb = (double *)malloc((sAS - 1) * sizeof(double));
}
//sAS-1, as sizeof(struct cell) already contains a single (double) for aa and bb.
Send_r = (struct cell *)malloc(count * (sizeof(struct cell)));
for (iter = 0; iter < count; iter++) {
Send_r[iter].aa = (double *)malloc((sAS - 1) * sizeof(double));
Send_r[iter].bb = (double *)malloc((sAS - 1) * sizeof(double));
}
}
With this, I can freely allocate, fill and deallocate, however when I call the following, my results diverge from my reference (using all stack arrays).
MPI_Sendrecv(&(Send_r[0]), count, ..., &(Send_l[0]), count, ...)
I haven't found the exact reason, but posts about similar issues made me assume it's due to my non-contiguous memory allocation. I've tried to solve the problem by using a single malloc call, only to get a segmentation fault when I fill my arrays aa and bb:
Send_l = malloc(count * (sizeof(*Send_l)) + count *(sizeof(*Send_l) + 2 * (sAS - 1) * sizeof(double)));
Send_r = malloc(count * (sizeof(*Send_r)) + count *(sizeof(*Send_r) + 2 * (sAS - 1) * sizeof(double)));
I have reused some code that allocates 2D arrays and applied it to this struct problem, but haven't been able to make it work. Am I right in assuming that, with a functioning single malloc call and therefore contiguous memory allocation, my MPI_Sendrecv would work fine? Alternatively, would using MPI_Type_create_struct solve my non-contiguous memory problem?
Minimal example (without MPI) of the segmentation fault. Using allocateSendRecv, everything is fine, but the single allocation in allocateInOneSendRecv gives me issues.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
struct cell {
double a, b, c, *aa, *bb;
} *Send_l, *Send_r;
void allocateSendRecv(int count, int sAS);
void fillSendRecv(int count, int sAS);
void freeSendRecv(int count);
void printSendRecv(int count, int sAS);
void allocateInOneSendRecv(int count, int sAS);
int main(int argc, char *argv[])
{
const int count = 2;
const int sAS = 9;
allocateSendRecv(count, sAS);
//allocateInOneSendRecv(count, sAS);
fillSendRecv(count, sAS);
printSendRecv(count, sAS);
freeSendRecv(count);
return 0;
}
void allocateSendRecv(int count, int sAS) {
int iter = 0;
printf("Allocating!\n");
Send_r = (struct cell *)malloc(count * (sizeof(struct cell)));
for (iter = 0; iter < count; iter++) {
Send_r[iter].aa = (double *)malloc((sAS - 1) * sizeof(double));
Send_r[iter].bb = (double *)malloc((sAS - 1) * sizeof(double));
}
Send_l = (struct cell *)malloc(count * (sizeof(struct cell)));
for (iter = 0; iter < count; iter++) {
Send_l[iter].aa = (double *)malloc((sAS - 1) * sizeof(double));
Send_l[iter].bb = (double *)malloc((sAS - 1) * sizeof(double));
}
}
void allocateInOneSendRecv(int count, int sAS) {
printf("Allocating!\n");
Send_l = malloc(count * (sizeof(*Send_l)) + count *(sizeof(*Send_l) + 2 * (sAS - 1) * sizeof(double)));
Send_r = malloc(count * (sizeof(*Send_r)) + count *(sizeof(*Send_r) + 2 * (sAS - 1) * sizeof(double)));
}
void freeSendRecv(int count) {
int iter = 0;
printf("Deallocating!\n");
free(Send_r);
free(Send_l);
}
void fillSendRecv(int count, int sAS) {
int iter = 0;
int iter2= 0;
double dummyDouble = 5.0;
printf("Filling!\n");
for (iter = 0; iter < count; iter++) {
Send_l[iter].a = dummyDouble;
Send_l[iter].b = dummyDouble;
Send_l[iter].c = dummyDouble;
for (iter2 = 0; iter2 < sAS; iter2++) {
Send_l[iter].aa[iter2] = dummyDouble;
Send_l[iter].bb[iter2] = dummyDouble;
}
dummyDouble++;
Send_r[iter].a = dummyDouble;
Send_r[iter].b = dummyDouble;
Send_r[iter].c = dummyDouble;
for (iter2 = 0; iter2 < sAS; iter2++) {
Send_r[iter].aa[iter2] = dummyDouble;
Send_r[iter].bb[iter2] = dummyDouble;
}
dummyDouble++;
}
}
void printSendRecv(int count, int sAS) {
int iter = 0;
printf("Printing!\n");
for (iter = 0; iter < count; iter++) {
printf("%f \n", Send_l[iter].a);
printf("%f \n", Send_l[iter].b);
printf("%f \n", Send_l[iter].c);
printf("%f \n", Send_l[iter].aa[sAS - 1]);
printf("%f \n\n", Send_l[iter].bb[sAS - 1]);
printf("%f \n", Send_r[iter].a);
printf("%f \n", Send_r[iter].b);
printf("%f \n", Send_r[iter].c);
printf("%f \n", Send_r[iter].aa[sAS - 1]);
printf("%f \n\n", Send_r[iter].bb[sAS - 1]);
}
}
CodePudding user response:
Your current problem is that you can only pass the start address of Send_l (resp. Send_r) to MPI_Sendrecv. From that address on, all the memory has to be contiguous, and you must know its total size so that you can give it to MPI_Sendrecv later.
After allocation, you must also ensure that the aa and bb members are correctly initialized to point inside the allocated block of memory.
A possible code could be:
void allocateSendRecv(int count, int subCount) {
int iter;
// total size of each struct
size_t sz = sizeof(struct cell) + 2 * subCount * sizeof(double);
// one single contiguous allocation
Send_r = malloc(count * sz); // note: never cast malloc in C language!
// for each cell, make aa and bb point into the allocated memory
for (iter = 0; iter < count; iter++) {
Send_r[iter].aa = ((double*)(Send_r + count)) + 2 * subCount * iter;
Send_r[iter].bb = Send_r[iter].aa + subCount;
}
// same for Send_l
Send_l = malloc(count * sz);
for (iter = 0; iter < count; iter++) {
Send_l[iter].aa = ((double*)(Send_l + count)) + 2 * subCount * iter;
Send_l[iter].bb = Send_l[iter].aa + subCount;
}
}
Here the array of cell structures comes first, followed by one aa array and one bb array per structure, in that order.
That is enough to get rid of the segmentation fault...
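One caveat with the single block: aa and bb are stored as addresses, so a byte-for-byte copy of the block (which is what a raw transfer amounts to) carries pointers that still refer to the original buffer. Below is a minimal sketch of re-basing them after the copy, with memcpy as a stand-in for the transfer. blockSize, fixPointers and copyRoundTrip are hypothetical helper names, not part of the code above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct cell {
    double a, b, c, *aa, *bb;
};

/* Bytes for `count` cells followed by their aa/bb payloads. */
size_t blockSize(int count, int subCount) {
    return (size_t)count *
           (sizeof(struct cell) + 2 * (size_t)subCount * sizeof(double));
}

/* Make every aa/bb point into the payload area behind the cell array. */
void fixPointers(struct cell *cells, int count, int subCount) {
    double *payload = (double *)(cells + count);
    for (int i = 0; i < count; i++) {
        cells[i].aa = payload + 2 * (size_t)subCount * (size_t)i;
        cells[i].bb = cells[i].aa + subCount;
    }
}

/* Hypothetical check: fill a block, copy it, re-base, and read back.
   Returns 1 when the copy is self-contained and holds the same values. */
int copyRoundTrip(int count, int subCount) {
    assert(count > 0 && subCount > 0);
    size_t bytes = blockSize(count, subCount);
    struct cell *src = malloc(bytes);
    struct cell *dst = malloc(bytes);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return 0;
    }

    fixPointers(src, count, subCount);
    for (int i = 0; i < count; i++) {
        src[i].a = src[i].b = src[i].c = (double)i;
        for (int j = 0; j < subCount; j++) {
            src[i].aa[j] = 10.0 * i + j;
            src[i].bb[j] = 100.0 * i + j;
        }
    }

    memcpy(dst, src, bytes);            /* stand-in for the transfer */
    fixPointers(dst, count, subCount);  /* copied pointers still aim at src */

    int last = count - 1;
    int ok = dst[last].aa != src[last].aa;
    ok = ok && dst[last].aa[subCount - 1] == src[last].aa[subCount - 1];
    ok = ok && dst[last].bb[subCount - 1] == src[last].bb[subCount - 1];
    free(src);
    free(dst);
    return ok;
}
```

On the receiving side the same re-basing step would presumably be needed right after the receive completes, before aa or bb are dereferenced.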
CodePudding user response:
The single global struct
struct cell
{
double a, b, c, *aa, *bb;
} * Send_l, *Send_r;
is a bit fragile:
- aa and bb are allocated as arrays of double, but the subCount - 1 size is not stored anywhere. It is buried in the code.
- Send_l and Send_r are also pointers to arrays of struct cell, but the count size is not stored anywhere either. It is also buried in the code.
- The single struct is global, which is also weak.
This makes it hard to test, allocate or free data. I will leave a C example that uses a bit of encapsulation and that you can adapt to your case under MPI. I will use your code and functions with a bit of OOP orientation :)
A Cell structure
typedef struct
{
double a;
double b;
double c;
double* aa;
double* bb;
} Cell;
The Send structure
typedef struct
{
Cell l;
Cell r;
} Send;
The Set structure
typedef struct
{
unsigned count;
unsigned subCount;
Send* send;
} Set;
So a Set has all that is needed to describe its contents.
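Since count and subCount travel with the Set, the amount of data per Cell is known, and one way to get a contiguous buffer for sending is to flatten each Cell into plain doubles. The sketch below is only an illustration under that assumption. cellDoubles, packCell, unpackCell and packRoundTrip are made-up names, and no MPI call is involved.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double a, b, c;
    double *aa;
    double *bb;
} Cell;

/* Doubles needed for one flattened Cell: a, b, c plus both arrays. */
size_t cellDoubles(unsigned subCount) {
    return 3 + 2 * (size_t)subCount;
}

/* Copy a Cell into out[] as a, b, c, aa[...], bb[...].
   Returns the number of doubles written. */
size_t packCell(const Cell *cell, unsigned subCount, double *out) {
    assert(cell != NULL && out != NULL);
    size_t k = 0;
    out[k++] = cell->a;
    out[k++] = cell->b;
    out[k++] = cell->c;
    for (unsigned j = 0; j < subCount; j++) out[k++] = cell->aa[j];
    for (unsigned j = 0; j < subCount; j++) out[k++] = cell->bb[j];
    return k;
}

/* Inverse of packCell. cell->aa and cell->bb must already point to
   storage for subCount doubles. Returns the number of doubles read. */
size_t unpackCell(Cell *cell, unsigned subCount, const double *in) {
    assert(cell != NULL && in != NULL);
    size_t k = 0;
    cell->a = in[k++];
    cell->b = in[k++];
    cell->c = in[k++];
    for (unsigned j = 0; j < subCount; j++) cell->aa[j] = in[k++];
    for (unsigned j = 0; j < subCount; j++) cell->bb[j] = in[k++];
    return k;
}

/* Small self-check: pack one Cell, unpack into another, compare. */
int packRoundTrip(void) {
    double aa[2] = {4.0, 5.0}, bb[2] = {6.0, 7.0};
    double aa2[2] = {0.0, 0.0}, bb2[2] = {0.0, 0.0};
    double buf[7];
    Cell src = {1.0, 2.0, 3.0, aa, bb};
    Cell dst = {0.0, 0.0, 0.0, aa2, bb2};
    if (packCell(&src, 2, buf) != cellDoubles(2)) return 0;
    if (unpackCell(&dst, 2, buf) != cellDoubles(2)) return 0;
    return dst.a == 1.0 && dst.c == 3.0 && dst.aa[1] == 5.0 && dst.bb[0] == 6.0;
}
```

A buffer built this way holds only doubles and no pointers, so the receiver can unpack it into cells it allocated itself.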
function prototypes
Set* allocateSendRecv(unsigned, unsigned);
int fillSendRecv(Set*);
Set* freeSendRecv(Set*);
Using encapsulation and a bit of RAII from C++, you can rewrite allocateSendRecv() and freeSendRecv() as constructor and destructor of the struct:
Set* allocateSendRecv(unsigned count, unsigned subCount)
{
// count is the number of send buffers
// subcount is the size of the arrays inside each cell
printf(
"Allocating(count = %u, subCount = %u)\n", count,
subCount);
Set* nw = (Set*)malloc(sizeof(Set));
nw->count = count;
nw->subCount = subCount;
nw->send = (Send*)malloc(count * sizeof(Send));
// now that we have Send allocate the Cell arrays
for (unsigned i = 0; i < count; i++)
{
nw->send[i].l.aa =
(double*)malloc(subCount * sizeof(double));
nw->send[i].l.bb =
(double*)malloc(subCount * sizeof(double));
nw->send[i].r.aa =
(double*)malloc(subCount * sizeof(double));
nw->send[i].r.bb =
(double*)malloc(subCount * sizeof(double));
}
return nw;
}
Set* freeSendRecv(Set* set)
{
if (set == NULL) return NULL;
printf(
"Deallocating(count = %u, subCount = %u)\n",
set->count, set->subCount);
for (unsigned i = 0; i < set->count; i++)
{
free(set->send[i].l.aa);
free(set->send[i].l.bb);
free(set->send[i].r.aa);
free(set->send[i].r.bb);
}
free(set->send);
free(set);
return NULL;
}
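The constructor above does not check any malloc() result. Here is a sketch of a variant that checks every allocation and unwinds through the destructor on failure. createSet and destroySet are hypothetical names, chosen so they do not clash with the functions above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { double a, b, c; double *aa; double *bb; } Cell;
typedef struct { Cell l; Cell r; } Send;
typedef struct { unsigned count; unsigned subCount; Send *send; } Set;

/* Release everything owned by a Set; safe on partially built sets
   as long as unused aa/bb pointers are NULL. Always returns NULL. */
Set *destroySet(Set *set) {
    if (set == NULL) return NULL;
    if (set->send != NULL) {
        for (unsigned i = 0; i < set->count; i++) {
            free(set->send[i].l.aa);
            free(set->send[i].l.bb);
            free(set->send[i].r.aa);
            free(set->send[i].r.bb);
        }
    }
    free(set->send);
    free(set);
    return NULL;
}

/* Build a Set; returns NULL, with nothing leaked, if any malloc fails. */
Set *createSet(unsigned count, unsigned subCount) {
    assert(count > 0 && subCount > 0);
    Set *nw = malloc(sizeof *nw);
    if (nw == NULL) return NULL;
    nw->count = count;
    nw->subCount = subCount;
    nw->send = malloc(count * sizeof *nw->send);
    if (nw->send == NULL) return destroySet(nw);

    /* Null every pointer first so destroySet() can run at any point. */
    for (unsigned i = 0; i < count; i++) {
        nw->send[i].l.aa = nw->send[i].l.bb = NULL;
        nw->send[i].r.aa = nw->send[i].r.bb = NULL;
    }
    for (unsigned i = 0; i < count; i++) {
        Cell *cells[2] = { &nw->send[i].l, &nw->send[i].r };
        for (int k = 0; k < 2; k++) {
            cells[k]->aa = malloc(subCount * sizeof(double));
            cells[k]->bb = malloc(subCount * sizeof(double));
            if (cells[k]->aa == NULL || cells[k]->bb == NULL)
                return destroySet(nw);
        }
    }
    return nw;
}
```

Because free(NULL) is a no-op, the same destructor serves both the normal path and the failure path.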
main() for the example
int main(void)
{
Set* tst = allocateSendRecv(2, 5);
fillSendRecv(tst);
printSendRecv(tst, "printSendRecv(): ");
tst = freeSendRecv(tst);
return 0;
}
Written this way, the tst pointer is invalidated in the same call to freeSendRecv(), since the function returns NULL and the result is assigned back to tst. In this case tst is allocated with count and subCount as 2 and 5, and these values are stored inside the Set.
fillSendRecv() uses incremental fill values to make it easy to pinpoint any displacement. printSendRecv() accepts a string for an optional message. Values are printed once while filling and again by printSendRecv() afterwards.
output of the example
Allocating(count = 2, subCount = 5)
Filling!
Filling set 1 of 2
l: [a,b,c] = [ 42.001, 42.002, 42.003]
aa: 42.004 42.005 42.006 42.007 42.008
bb: 42.009 42.010 42.011 42.012 42.013
r: [a,b,c] = [ 42.014, 42.015, 42.016]
aa: 42.017 42.018 42.019 42.020 42.021
bb: 42.022 42.023 42.024 42.025 42.026
Filling set 2 of 2
l: [a,b,c] = [ 42.027, 42.028, 42.029]
aa: 42.030 42.031 42.032 42.033 42.034
bb: 42.035 42.036 42.037 42.038 42.039
r: [a,b,c] = [ 42.040, 42.041, 42.042]
aa: 42.043 42.044 42.045 42.046 42.047
bb: 42.048 42.049 42.050 42.051 42.052
printSendRecv(): Count is 2, subCount is 5
Set 1 of 2
l:
[a,b,c] = [ 42.001, 42.002, 42.003]
aa: 42.004 42.005 42.006 42.007 42.008
bb: 42.009 42.010 42.011 42.012 42.013
r:
[a,b,c] = [ 42.014, 42.015, 42.016]
aa: 42.017 42.018 42.019 42.020 42.021
bb: 42.022 42.023 42.024 42.025 42.026
Set 2 of 2
l:
[a,b,c] = [ 42.027, 42.028, 42.029]
aa: 42.030 42.031 42.032 42.033 42.034
bb: 42.035 42.036 42.037 42.038 42.039
r:
[a,b,c] = [ 42.040, 42.041, 42.042]
aa: 42.043 42.044 42.045 42.046 42.047
bb: 42.048 42.049 42.050 42.051 42.052
Deallocating(count = 2, subCount = 5)
The increment is 0.001.
The example in 2 files
a header v1.h
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
typedef struct
{
double a;
double b;
double c;
double* aa;
double* bb;
} Cell;
typedef struct
{
Cell l;
Cell r;
} Send;
typedef struct
{
unsigned count;
unsigned subCount;
Send* send;
} Set;
main file v1.c
#include "v1.h"
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
Set* allocateSendRecv(unsigned, unsigned);
int fillSendRecv(Set*);
Set* freeSendRecv(Set*);
double getNext(void);
int printCell(Cell*, unsigned, const char*);
int printSendRecv(Set*, const char*);
int main(void)
{
Set* tst = allocateSendRecv(2, 5);
fillSendRecv(tst);
printSendRecv(tst, "printSendRecv(): ");
tst = freeSendRecv(tst);
return 0;
}
Set* allocateSendRecv(unsigned count, unsigned subCount)
{
// count is the number of send buffers
// subcount is the size of the arrays inside each cell
printf(
"Allocating(count = %u, subCount = %u)\n", count,
subCount);
Set* nw = (Set*)malloc(sizeof(Set));
nw->count = count;
nw->subCount = subCount;
nw->send = (Send*)malloc(count * sizeof(Send));
// now that we have Send allocate the Cell arrays
for (unsigned i = 0; i < count; i++)
{
nw->send[i].l.aa =
(double*)malloc(subCount * sizeof(double));
nw->send[i].l.bb =
(double*)malloc(subCount * sizeof(double));
nw->send[i].r.aa =
(double*)malloc(subCount * sizeof(double));
nw->send[i].r.bb =
(double*)malloc(subCount * sizeof(double));
}
return nw;
}
Set* freeSendRecv(Set* set)
{
if (set == NULL) return NULL;
printf(
"Deallocating(count = %u, subCount = %u)\n",
set->count, set->subCount);
for (unsigned i = 0; i < set->count; i++)
{
free(set->send[i].l.aa);
free(set->send[i].l.bb);
free(set->send[i].r.aa);
free(set->send[i].r.bb);
}
free(set->send);
free(set);
return NULL;
}
int fillSendRecv(Set* s)
{
printf("Filling!\n");
if (s == NULL) return -1;
for (unsigned i = 0; i < s->count; i += 1)
{
printf("\tFilling set %u of %u\n", 1 + i, s->count);
// l
s->send[i].l.a = getNext();
s->send[i].l.b = getNext();
s->send[i].l.c = getNext();
for (unsigned j = 0; j < s->subCount; j += 1)
s->send[i].l.aa[j] = getNext();
for (unsigned j = 0; j < s->subCount; j += 1)
s->send[i].l.bb[j] = getNext();
printCell(&s->send[i].l, s->subCount, "\tl:");
// r
s->send[i].r.a = getNext();
s->send[i].r.b = getNext();
s->send[i].r.c = getNext();
for (unsigned j = 0; j < s->subCount; j += 1)
s->send[i].r.aa[j] = getNext();
for (unsigned j = 0; j < s->subCount; j += 1)
s->send[i].r.bb[j] = getNext();
printCell(&s->send[i].r, s->subCount, "\tr:");
}
return 0;
}
double getNext(void)
{
static double ix = 42.;
ix += .001;
return ix;
}
int printCell(Cell* cell, unsigned sz, const char* msg)
{
// field width 10 is an assumption; any width wide enough for the values works
printf(
"%s\t[a,b,c] = [%10.3f,%10.3f,%10.3f]\n", msg,
cell->a, cell->b, cell->c);
printf("\taa: ");
for (unsigned j = 0; j < sz; j += 1)
printf("%10.3f ", cell->aa[j]);
printf("\n\tbb: ");
for (unsigned j = 0; j < sz; j += 1)
printf("%10.3f ", cell->bb[j]);
printf("\n\n");
return 0;
}
int printSendRecv(Set* s, const char* msg)
{
if (s == NULL) return -1;
if (msg != NULL) printf("%s", msg);
printf(
" Count is %u, subCount is %u\n", s->count,
s->subCount);
for (unsigned i = 0; i < s->count; i += 1)
{
printf("\tSet %u of %u\n", 1 + i, s->count);
printCell(&s->send[i].l, s->subCount, "\tl:\n");
printCell(&s->send[i].r, s->subCount, "\tr:\n");
printf("\n");
}
printf("\n");
return 0;
}
casting the return for malloc()
Yes, I always cast the return of malloc(), as I and many others do not like anything implicit. Also, malloc() accepts any expression that evaluates to a size, and looking at the expression does not always say much about the area being allocated. Many times a program allocates data for many structures, some of them enclosed in others. This little program has 3. So the cast works as a reminder for the programmers of what the program intends to allocate, and can avoid many bugs, since the expression is often not sufficient to show what is what.
This thing about malloc() and the cast comes from the C-FAQ, an old, never-updated compilation of Usenet articles, all dating from before 2000. And even at that time people wrote there about the possible reasons to cast the pointer.
One of the points discussed in the C-FAQ is what happens when the programmer has forgotten the include for stdlib.h. Funny. I mean it:
Suppose that you call malloc but forget to #include <stdlib.h>.
The compiler is likely to assume that malloc is a function
returning int, which is of course incorrect, and will lead to trouble
Therefore, the seemingly redundant casts are used by people who are
(a) concerned with portability to all pre-ANSI compilers, or
(b) of the opinion that implicit conversions are a bad thing.
I would add the reason I described above.
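For comparison, the two styles being argued about look like this side by side. This is only an illustration, not a recommendation of either one. allocCellsCast, allocCellsNoCast and bothStylesWork are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { double a, b, c; double *aa; double *bb; } Cell;

/* With the cast: the target type is visible at the call site. */
Cell *allocCellsCast(size_t n) {
    assert(n > 0);
    return (Cell *)malloc(n * sizeof(Cell));
}

/* Without the cast: the size follows the pointer's own type. */
Cell *allocCellsNoCast(size_t n) {
    assert(n > 0);
    Cell *p = malloc(n * sizeof *p);
    return p;
}

/* Allocate n cells both ways and touch the last element of each.
   Returns 1 when both allocations succeed and are writable. */
int bothStylesWork(size_t n) {
    Cell *x = allocCellsCast(n);
    Cell *y = allocCellsNoCast(n);
    int ok = (x != NULL && y != NULL);
    if (ok) {
        x[n - 1].a = 1.0;
        y[n - 1].a = 1.0;
        ok = (x[n - 1].a == y[n - 1].a);
    }
    free(x);
    free(y);
    return ok;
}
```

Both functions request the same number of bytes; the difference is only in how much of the type information is spelled out in the source.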