Home > Net >  Heap Corruption error when calling C from R, can't find the source issue
Heap Corruption error when calling C from R, can't find the source issue

Time:05-20

UPDATE2: Refactored to use R_alloc instead of calloc for automated cleanup. Unfortunately the problem persists.

UPDATE: If I add this line right before UNPROTECT(1):

Rprintf("%p %p %p", (void *)rans, (void *)fm, (void *)corrs);

then the function executes with no corrupted heap error. Maybe there's a background garbage collection call that corrupts one of the pointers prior to execution finishing, resulting in a write to a garbage pointer? Important to note here that if I don't print out all three of the pointer addresses, the error comes back.

Also I'm running this on an M1 Mac and compiling with clang via R CMD SHLIB, in case Apple silicon is to blame.


I'm at my wits end trying to debug this issue, and I figured I'd turn to SO for help. I'm writing a function in C to optimize some parts of my R code, and I'm getting a Heap Corruption Error when running the function many times. The function trimCovar() is called from R using the .Call("trimCovar", ...) interface.

I'm having a lot of difficulty debugging this for a few reasons:

  1. I'm on OSX, so I can't use Valgrind
  2. C function depends on inputs from R, so I can't debug the C code on its own
  3. Heap corruption only occurs when calling the function many times within an R function (just running .Call directly a bunch of times has no errors)
  4. Error point is inconsistent

I start with two sets of vectors, and I condense them into a frequency matrix, where each column is a position in the vector set, and each row is a particular character that appears. I concatenate them into one matrix prior to passing in because it makes pre-processing easier. An toy example of the frequency matrix would be:

INPUT:             
  v1_1 = 101
  v1_2 = 011

  v2_1 = 111
  v2_2 = 110

Frequency Matrix:

position: | 1_1 | 1_2 | 1_3 | 2_1 | 2_2 | 2_3 |

0:          0.5   0.5   0.0   0.0   0.0   0.5
1:          0.5   0.5   1.0   1.0   1.0   0.5


The goal is to find the NV highest correlated positions across the vector sets, which I do by calculating pairwise KL divergence of positions. These are stored in a linked list sorted in ascending order, and at the end I take the positions corresponding to the first NV entries. The R code I have can deparse everything else, so I really just need a vector of positions at the end (duplicates are allowed).

The function takes in 5 arguments:

  • fMAT: a frequency matrix (RObject, so gets read in as a flat vector)
  • fSP : columns in matrix corresponding to positions from the first vector set
  • sSP : same as fSP but for second vector set
  • NV : Number of values to return
  • NR : Number of columns in fMAT

The error returned is:

R(95564,0x104858580) malloc: Heap corruption detected, free list is damaged at 0x600000f10040
*** Incorrect guard value: 4626885667169763328
R(95564,0x104858580) malloc: *** set a breakpoint in malloc_error_break to debug

This only happens when I run an R function that calls this 10 times, so I'm assuming that I'm just missing one or two small hanging pointers corrupting a memory reference. I've tried running this with gc() called in R immediately after each call, but it doesn't fix the problem. I'm not really sure what else to do at this point, I've tried using lldb but I'm not really sure how to use that program. From running lots of print statements I've determined that it usually crashes in the main loop (identified in code below), but it's inconsistent on when it crashes. I've also tried saving off erroneous inputs--I can rerun them individually with no issues, so it must be something relatively small that only appears over many runs.

Happy to provide more details if it would help. Code is listed at the bottom.

The only thing being allocated here are linked list nodes, and I thought I had free()'d them all prior to returning. I've also double checked the input values, so I'm 99.99% sure that I'm never referencing out of bounds on firstSeqPos, secondSeqPos, ans, or fm. I've also triple checked the R code surrounding this and can confidently say it is not the source of this error.

I haven't coded in C in a long time so I feel like I'm missing something obvious. If I really have to I can try to get ahold of a Linux box to run valgrind, but if there's another option I'd prefer it. Thanks in advance!

Code:

#include <R.h>
#include <Rdefines.h>
#include <Rinternals.h>
#include <math.h>
#include <stdlib.h>
#include <stdbool.h>

typedef struct node {
  double data;
  int i1;
  int i2;
  struct node *next;
} node;

// Linked list
// data is the correlation value,
// i1 the position from first vector set,
// i2 the position from second vector set
node *makeNewNode(double data, int i1, int i2){
  node *newNode; 
  newNode = (node *)R_alloc(1, sizeof(node));
  newNode->data = data;
  newNode->i1 = i1;
  newNode->i2 = i2;
  newNode->next = NULL;
  return(newNode);
}

//insert link in sorted order (ascending)
void insertSorted(node **head, node *toInsert, int maxSize) {
  int ctr = 0;
  if ((*head) == NULL || (*head)->data >= toInsert->data){
    toInsert->next = *head;
    *head = toInsert;
  } else {
    node *temp = *head;
    while (temp->next != NULL && temp->next->data < toInsert->data){
      temp = temp->next;
      if (ctr == maxSize){
        // Performance optimization, if we aren't inserting in the first NR
        // positions then we can just skip since we only care about the NR 
        // lowest scores overall
        return;
      }
      ctr  = 1;
    }
    toInsert->next = temp->next;
    temp->next = toInsert;
  }
}

// MAIN FUNCTION CALLED FROM R
// (This is the one that crashes)
SEXP trimCovar(SEXP fMAT, SEXP fSP, SEXP sSP, SEXP NV, SEXP NR){
  // Converting input SEXPs into C-compatible values
  int nv = asInteger(NV);
  int nr = asInteger(NR);
  int sp1l = length(fSP);
  int sp2l = length(sSP);

  int *firstSeqPos = INTEGER(coerceVector(fSP, INTSXP));
  int *secondSeqPos = INTEGER(coerceVector(sSP, INTSXP));
  double *fm = REAL(fMAT);
  int colv1, colv2;

  // Using a linked list for efficient insert
  node *corrs = NULL;
  int cv1, cv2;
  double p1, p2, score=0;

  // USUALLY FAILS IN THIS LOOP
  for ( int i=0; i<sp1l; i   ){
    cv1 = firstSeqPos[i];
    colv1 = (cv1 - 1) * nr;
    for ( int j=0; j<sp2l; j   ){
      cv2 = secondSeqPos[j];
      colv2 = (cv2 - 1) * nr;

      // KL Divergence
      score = 0;
      for ( int k=0; k<nr; k  ){
        p1 = fm[colv1   k];
        p2 = fm[colv2   k];
        if (p1 != 0 && p2 != 0){
          score  = p1 * log(p1 / p2);
        }
      }
      // Add result into LL
      node *newNode = makeNewNode(score, cv1, cv2);
      insertSorted(&corrs, newNode, nv);
    }
    R_CheckUserInterrupt();
  }
  SEXP ans;
  PROTECT(ans = allocVector(INTSXP, 2*nv));
  int *rans = INTEGER(ans); 
  int ctr=0;
  int pos1, pos2;
  node *ptr = corrs;

  for ( int i=0; i<nv; i  ){
    rans[2*i] = ptr->i1;
    rans[2*i 1] = ptr->i2;
    ptr = ptr->next;
  }

  UNPROTECT(1);
  return(ans);
}

CodePudding user response:

int *firstSeqPos = INTEGER(coerceVector(fSP, INTSXP));
int *secondSeqPos = INTEGER(coerceVector(sSP, INTSXP));

This is not good. The SEXPs returned by the 2 calls to coerceVector() need to be protected. However it's usually considered better practice to do this coercion at the R level right before entering the .Call entry point. Note that if fSP and sSP are integer matrices, there's no need to coerce them to integer as they are already seen as integer vectors at the C level. This also avoids a possibly expensive copy (as.integer() in R and coerceVector() in C both trigger a full copy of the matrix data).

  •  Tags:  
  • r c
  • Related