How to write dynamically allocated 3D array into hdf5 file in C?

Time:12-10

I have a dynamically allocated 3D array realized as pointers to arrays of pointers to arrays (at least that is my interpretation of what I am doing) and want to store that data in an hdf5 file. While something is stored in the file, it is not the original data.

Here is my code (the error-checking stuff is left out here):

#include <stdlib.h>
#include <stdio.h>
#include <hdf5.h>

double ***arr3D_d( size_t dim1, size_t dim2, size_t dim3 ) {
    size_t  ii, jj;
    double  ***arr;

    arr = calloc( (size_t)dim1, sizeof(double**) );
    for ( ii=0 ; ii<dim1 ; ++ii ) {
        arr[ii] = calloc( (size_t)(dim2*dim3), sizeof(double*) );
        for ( jj=0 ; jj<dim2 ; ++jj ) {
            arr[ii][jj] = calloc( (size_t)(dim3), sizeof(double) );
        }
    }
    return arr;
}

int main( int argc, char *argv[] ) {
    size_t  ii, jj, kk,
            dim1, dim2, dim3;
    double  ***arr3D;

    // hdf5 related variables
    hid_t   file_id, dataset_id, dataspace_id;
    hsize_t dims[3];
    herr_t  status;

    dim1    = 2;
    dim2    = 3;
    dim3    = 4;
    arr3D   = arr3D_d( dim1, dim2, dim3 );

    for (ii=0 ; ii<dim1 ; ++ii)
        for (jj=0 ; jj<dim2 ; ++jj)
            for (kk=0 ; kk<dim3 ; ++kk)
                arr3D[ii][jj][kk]   = ii + jj + kk;

    for (ii=0 ; ii<dim1 ; ++ii)
        for (jj=0 ; jj<dim2 ; ++jj)
            for (kk=0 ; kk<dim3 ; ++kk)
                printf( "arr3D[%zu][%zu][%zu] = %f\n",
                        ii, jj, kk, arr3D[ii][jj][kk] ); 

    // create new file for hdf5 data to be written into
    file_id = H5Fcreate( "data.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT );
    // create simple dataspace for the dataset
    dims[0] = dim1;
    dims[1] = dim2;
    dims[2] = dim3;
    dataspace_id    = H5Screate_simple( 3, dims, NULL );
    // create dataset
    dataset_id      = H5Dcreate( file_id, "dataset", H5T_NATIVE_DOUBLE, dataspace_id, 
        H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT );
    // write the dataset
    status          = H5Dwrite( dataset_id, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, 
        H5P_DEFAULT, arr3D[0][0] );
    // terminate access and free identifiers
    status          = H5Dclose(dataset_id);
    status          = H5Sclose(dataspace_id);
    status          = H5Fclose(file_id);

    return 0;
}

When I now output the data with h5dump, it reads as follows:

HDF5 "data.h5" {
GROUP "/" {
   DATASET "dataset" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SIMPLE { ( 2, 3, 4 ) / ( 2, 3, 4 ) }
      DATA {
      (0,0,0): 0, 1, 2, 3,
      (0,1,0): 0, 2.42092e-322, 1, 2,
      (0,2,0): 3, 4, 0, 2.42092e-322,
      (1,0,0): 2, 3, 4, 5,
      (1,1,0): 0, 5.58294e-322, 4.64561e-310, 4.64561e-310,
      (1,2,0): 4.64561e-310, 0, 0, 0
      }
   }
}
}

This does not correspond to the arr3D in the code, which is printed to console during run time - the output reads:

arr3D[0][0][0] = 0.000000
arr3D[0][0][1] = 1.000000
arr3D[0][0][2] = 2.000000
arr3D[0][0][3] = 3.000000
arr3D[0][1][0] = 1.000000
arr3D[0][1][1] = 2.000000
arr3D[0][1][2] = 3.000000
arr3D[0][1][3] = 4.000000
arr3D[0][2][0] = 2.000000
arr3D[0][2][1] = 3.000000
arr3D[0][2][2] = 4.000000
arr3D[0][2][3] = 5.000000
arr3D[1][0][0] = 1.000000
arr3D[1][0][1] = 2.000000
arr3D[1][0][2] = 3.000000
arr3D[1][0][3] = 4.000000
arr3D[1][1][0] = 2.000000
arr3D[1][1][1] = 3.000000
arr3D[1][1][2] = 4.000000
arr3D[1][1][3] = 5.000000
arr3D[1][2][0] = 3.000000
arr3D[1][2][1] = 4.000000
arr3D[1][2][2] = 5.000000
arr3D[1][2][3] = 6.000000

As written above, this is not what is written into the hdf5-file. What am I doing wrong?

Answer:

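The garbage values come from H5Dwrite() reading memory that is not part of the data. H5Dwrite() expects one contiguous buffer of dim1*dim2*dim3 doubles, but arr3D[0][0] points only to the first row of dim3 doubles. Every row is a separate calloc() block, so the write runs past the end of the first row and picks up whatever lies next on the heap: allocator bookkeeping, other rows, and eventually pointers. The simplest way to get a contiguously allocated array that is still indexed as arr3D[ii][jj][kk] is to allocate it through a pointer to a Variable Length Array (VLA). Just replace all the allocation code with the following line: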

 double (*arr3D)[dim2][dim3] = calloc(dim1, sizeof *arr3D);

That's all.

Remember to call free(arr3D) to release it.
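To make the contiguity claim concrete, here is a minimal sketch. The helper name vla_cube_is_contiguous is made up for illustration and is not part of the original code. It allocates the cube with the one-line VLA allocation, fills it the same way as the question does, and then walks the same memory as a flat array of doubles, which is how H5Dwrite() reads its buffer.

```c
#include <stdlib.h>

/* Hypothetical helper (not in the original post): allocates a
   dim1 x dim2 x dim3 cube as ONE block through a pointer to a VLA,
   fills it with ii + jj + kk, and checks that a flat row-major walk
   over the same memory sees the same values.
   Returns 1 on match, 0 on mismatch, -1 if the allocation fails. */
int vla_cube_is_contiguous(size_t dim1, size_t dim2, size_t dim3)
{
    /* One calloc call: dim1 elements, each of type double[dim2][dim3]. */
    double (*arr3D)[dim2][dim3] = calloc(dim1, sizeof *arr3D);
    if (arr3D == NULL)
        return -1;

    for (size_t ii = 0; ii < dim1; ++ii)
        for (size_t jj = 0; jj < dim2; ++jj)
            for (size_t kk = 0; kk < dim3; ++kk)
                arr3D[ii][jj][kk] = (double)(ii + jj + kk);

    /* View the same block as a flat buffer of doubles. */
    const double *flat = (const double *)arr3D;
    int ok = 1;
    for (size_t ii = 0; ii < dim1; ++ii)
        for (size_t jj = 0; jj < dim2; ++jj)
            for (size_t kk = 0; kk < dim3; ++kk)
                if (flat[(ii * dim2 + jj) * dim3 + kk] != (double)(ii + jj + kk))
                    ok = 0;

    free(arr3D);   /* a single free releases the whole cube */
    return ok;
}
```

With the dimensions from the question, vla_cube_is_contiguous(2, 3, 4) returns 1: element [ii][jj][kk] sits at flat offset (ii*dim2 + jj)*dim3 + kk, with no gaps between rows.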

Contrary to popular belief, the main reason VLAs were added to C was to simplify the handling of multidimensional arrays, not to allow stack allocation of objects whose size is only known at run time.
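VLA support is optional in C11 (a compiler may define __STDC_NO_VLA__), and some compilers, MSVC for example, do not provide it. As an alternative that is not part of the answer above, the double*** interface from the question can be kept while all doubles live in one contiguous block. The names arr3D_contig and arr3D_contig_free below are hypothetical, and the sketch assumes all three dimensions are greater than zero.

```c
#include <stdlib.h>

/* Hypothetical replacement for arr3D_d(): same double*** interface,
   but every double lives in ONE contiguous block, so arr[0][0] is a
   valid flat buffer of dim1*dim2*dim3 doubles.
   Assumes dim1, dim2, dim3 > 0. Returns NULL if an allocation fails. */
double ***arr3D_contig(size_t dim1, size_t dim2, size_t dim3)
{
    double ***arr  = malloc(dim1 * sizeof *arr);          /* plane pointers */
    double  **rows = malloc(dim1 * dim2 * sizeof *rows);  /* row pointers   */
    double   *data = calloc(dim1 * dim2 * dim3, sizeof *data); /* the data  */

    if (arr == NULL || rows == NULL || data == NULL) {
        free(arr);
        free(rows);
        free(data);
        return NULL;
    }

    /* Point the tables into the contiguous blocks (row-major order). */
    for (size_t ii = 0; ii < dim1; ++ii) {
        arr[ii] = rows + ii * dim2;
        for (size_t jj = 0; jj < dim2; ++jj)
            arr[ii][jj] = data + (ii * dim2 + jj) * dim3;
    }
    return arr;
}

/* Releases the three blocks allocated by arr3D_contig(). */
void arr3D_contig_free(double ***arr)
{
    if (arr != NULL) {
        free(arr[0][0]);  /* data block         */
        free(arr[0]);     /* row-pointer table  */
        free(arr);        /* plane-pointer table */
    }
}
```

With this layout the question's original call, which passes arr3D[0][0] to H5Dwrite(), hands HDF5 a buffer that really is contiguous. The cost is two extra pointer tables that the VLA version does not need.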

When writing the array with H5Dwrite(), simply pass arr3D as the last argument.

The content of hdf5 file after the change is:

HDF5 "data.h5" {
GROUP "/" {
   DATASET "dataset" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SIMPLE { ( 2, 3, 4 ) / ( 2, 3, 4 ) }
      DATA {
      (0,0,0): 0, 1, 2, 3,
      (0,1,0): 1, 2, 3, 4,
      (0,2,0): 2, 3, 4, 5,
      (1,0,0): 1, 2, 3, 4,
      (1,1,0): 2, 3, 4, 5,
      (1,2,0): 3, 4, 5, 6
      }
   }
}
}