Home > Software design >  Looping over array of string using huge amounts of memory? (in C)
Looping over array of string using huge amounts of memory? (in C)

Time:07-09

I use a program written in C (not mine) which simulates the chemistry in environments in space. The program is large and complex, but I want to make one very small amendment...

I would like to create an array of 122 different molecules (species), and iterate through them, each time calling a function that retrieves the 'abundance' of that molecule...

void calculate_molecule(int i_x, int i_y){
  
  double **abundance

  char *molecules[122] = {"H", "He", "C", "N", "O"...}      // 122 molecules altogether
 
  for (i=0;i<n_molecules;i  {
    
    abundance = get_abu(molecules[i]);

    cell[i_x][i_y].abu[i] = abundance[i_x][i_y];

    free(abundance);

  }
 

This works, but uses a huge amount of memory! The calculate_molecule() function is called 10,000 times during the run of the program. Is there an issue with creating this large array every time the function is called?

(things like the get_abu() function and n_molecules are defined elsewhere, its only iterating over the array that is my addition)

I have very little understanding of C, only Python, so I'm sure there is some obvious mistake I have made!

CodePudding user response:

You're not freeing abundance correctly; that's why you're seeing a huge memory use.

Based on the declaration of abundance and the abundance[i_x][i_y] expression, it's clear that get_abu allocates a 2D structure that looks something like this:

            ---         ---                              --- 
abundance: |   | ----> |   | abundance[0] ------------> |   | abundance[0][0]
            ---         ---                              --- 
                       |   | abundance[1] ----          |   | abundance[0][1]
                        ---                   |          --- 
                        ...                   |          ...
                                              |
                                              |          --- 
                                               -------> |   | abundance[1][0]
                                                         --- 
                                                        |   | abundance[1][1]
                                                         --- 
                                                         ...

where each "row" is allocated separately. You have to free each "row" of abundance before freeing abundance itself:

for ( size_t k = 0; k < number_of_rows_in_abundance; k   )
  free( abundance[k] );
free( abundance );

This requires you to know how many "rows" are in abundance, though I'm assuming you know that.

A better idea would be to allocate all that memory once for all of the molecules in a separate operation, then get the abundance for each molecule at each index, then free everything. So, as a suggestion, I would create two more functions - one that allocates abundance for all molecules and one that deallocates it, then just pass abundance as an argument to the calculate_molecule function:

double ***calculate_abundance( char **molecules, size_t count )
{
  double ***abundance;

  abundance = malloc( count * sizeof **abundance );
  if ( abundance )
  {
    for ( size_t i = 0; i < count; i   )
      abundance[i] = get_abu( molecules[i] );
  }
  return abundance;
}

void calculate_molecule( int i_x, int i_y, double ***abundance, size_t count )
{
  for ( size_t i = 0; i < count; i   )
    cell[i_x][i_y].abu[i] = abundance[i][i_x][i_y]; 
}

void free_abundance( double ***abundance, size_t count )
{
  for ( size_t i = 0; i < count; i   )
  {
    for ( size_t k = 0; k < rows_for_each_abundance_i; k   )
    {
      free( abundance[i][k] );
    }
    free( abundance[i] );
  }
  free( abundance );
}

So in the code that calls calculate_molecule, you'd do something like this:

 char *molecules[] = { "H", "He", ... };
 size_t count = sizeof molecules / sizeof molecules[0];
 double ***abundance = calculate_abundance( molecules, count );
 if ( abundance )
 {
   /** 
    * do whatever you do that calls calculate_molecule; the call
    * will now look like this:
    */
   calculate_molecule( x, y, abundance, count );
   
   /**
    * When you don't need abundance anymore, deallocate it
    */
   free_abundance( abundance, count );
}
  • Related