For loop and if statements to change matrix values dependent on other values in the row

Let's say I have the following matrix,

mat.data <- c(1, NA, NA, NA, 1, NA, 1, 2, 3)
mat <- matrix(mat.data,nrow=3,ncol=3,byrow=TRUE)
mat

     [,1] [,2] [,3]
[1,]    1   NA   NA
[2,]   NA    1   NA
[3,]    1    2    3

The columns are sequential measurements of 3 different individuals (the rows). An NA can mean either that the person is dead or that the value is missing. It can be assumed that if an NA is followed only by NA values, the person is dead; otherwise, the value is missing.

I want to write a for loop with if statements that changes each NA to 100 if the value is missing or 99 if the person is dead, so that we end up with the following matrix.

mat1.data <- c(1, 99, 99, 100, 1, 99, 1, 2, 3)
mat1 <- matrix(mat1.data,nrow=3,ncol=3,byrow=TRUE)
mat1
     [,1] [,2] [,3]
[1,]    1   99   99
[2,]  100    1   99
[3,]    1    2    3

I am having trouble categorising the missing values. A cell should become 100 if mat[r,c] is NA and there are non-NA values after it in the row. This is the code I started with, but I am unsure what to write for the part after the &&.

mat1 <- matrix()
for (x in 1:nrow(mat)) {
  for (y in 1:ncol(mat)) {
    if (is.na(mat[x,y]) && (!is.na(mat[x,y + c(0:(ncol(mat)-y))]))){
      mat1[x,y] = 100
    }
    else if(is.na(mat[x,y])){
      mat1[x,y] = 99
    }
    else
      mat1[x,y] = mat[x,y]
    
  }
  
}

CodePudding user response：

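One option is to reverse each row and take the cumulative sum of the absolute changes in is.na(). With TRUE prepended, that sum stays at 0 only across the trailing run of NAs.

f <- \(x) {  # \(x) is shorthand for function(x), R >= 4.1
  # TRUE where x is part of the trailing run of NAs
  dead <- rev(cumsum(abs(diff(c(TRUE, rev(is.na(x))))))) == 0
  x[dead] <- 99                # trailing NAs: dead
  x[is.na(x) & !dead] <- 100   # remaining NAs: missing
  x
}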

t(apply(mat, 1, f))
#      [,1] [,2] [,3]
# [1,]    1   99   99
# [2,]  100    1   99
# [3,]    1    2    3
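
Another way to flag the trailing run of NAs is a reversed cumprod. This is only a sketch and not the method used in the answer above; the helper name recode_row is made up for illustration.

```r
# Hypothetical helper; recodes one row (a numeric vector).
recode_row <- function(x) {
  na <- is.na(x)
  # Reading from the end of the row, cumprod stays 1 only while every
  # value seen so far is NA, so it marks exactly the trailing NA run.
  dead <- rev(cumprod(rev(na))) == 1
  x[dead] <- 99          # trailing NAs: dead
  x[na & !dead] <- 100   # NAs followed by an observed value: missing
  x
}

recode_row(c(NA, 1, NA))  # 100 1 99
recode_row(c(NA, 1, 2))   # 100 1 2
```

It can be applied to the whole matrix in the same way, with t(apply(mat, 1, recode_row)).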

Or if you prefer the for loop:

for (i in seq_len(nrow(mat))) {
    # TRUE where row i is part of its trailing run of NAs
    dead <- rev(cumsum(abs(diff(c(TRUE, rev(is.na(mat[i, ]))))))) == 0
    mat[i, dead] <- 99
    mat[i, is.na(mat[i, ]) & !dead] <- 100
}
mat
#      [,1] [,2] [,3]
# [1,]    1   99   99
# [2,]  100    1   99
# [3,]    1    2    3
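
For comparison, here is a minimal sketch of the nested for loop with if statements that the question was aiming for. It is not taken from either answer. It assumes the condition after the && can be phrased as "every value from this column to the end of the row is NA", and it starts from a copy of mat so that mat1 already has the right dimensions.

```r
mat <- matrix(c(1, NA, NA, NA, 1, NA, 1, 2, 3), nrow = 3, ncol = 3, byrow = TRUE)

mat1 <- mat  # copy, so non-NA cells keep their values
for (x in seq_len(nrow(mat))) {
  for (y in seq_len(ncol(mat))) {
    if (is.na(mat[x, y])) {
      # Dead if everything from this column to the end of the row is NA.
      rest_all_na <- all(is.na(mat[x, y:ncol(mat)]))
      mat1[x, y] <- if (rest_all_na) 99 else 100
    }
  }
}
mat1
#      [,1] [,2] [,3]
# [1,]    1   99   99
# [2,]  100    1   99
# [3,]    1    2    3
```

Using all() collapses the vector of is.na() results into a single TRUE or FALSE, which is what an if condition needs.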

CodePudding user response：

This ended up not being as nice as I would have liked, but it should work:

na.ends <- lapply(apply(is.na(mat), 1, rle), function(x) {
  last <- length(x$values)
  if (x$values[last]==1) {
    sum(x$lengths) - x$lengths[last]:1 + 1
  } else {
    numeric(0)
  }
})
dead.pos <- do.call("rbind", Map(function(x, y) if (length(y)>0) cbind(x,y) else NULL, seq_along(na.ends), na.ends))
mat[dead.pos] <- 99
mat[is.na(mat)] <- 100

The idea is that we use rle to calculate runs of NA values and then find the runs that sit at the end of a row. We then grab the indexes of those values and assign them all to 99 in one go using matrix indexing. Any NA values still left in the matrix must be missing values rather than deaths, so we can fill them all with 100.
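
To make the rle step concrete, here is a small sketch of what it returns for a single row of the example matrix and how the trailing positions fall out of it. The variable names r and trailing are only illustrative.

```r
mat <- matrix(c(1, NA, NA, NA, 1, NA, 1, 2, 3), nrow = 3, ncol = 3, byrow = TRUE)

r <- rle(is.na(mat[2, ]))   # row 2 is NA, 1, NA
r$lengths                   # 1 1 1: three runs of length one
r$values                    # TRUE FALSE TRUE: NA run, observed run, NA run

# If the last run is an NA run, its columns are the "dead" positions.
last <- length(r$values)
trailing <- if (r$values[last]) {
  sum(r$lengths) - r$lengths[last]:1 + 1
} else {
  numeric(0)
}
trailing                    # 3: only the final column of row 2
```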