Home > Software engineering >  Calculate column based on a different column incrementally in a matrix in R
Calculate column based on a different column incrementally in a matrix in R

Time:06-14

Say I have a vector of ages of 100 trees. Then I age those trees up for 5, 10, 15, and 20 years into the future to create a matrix of tree ages for this year and four 5-year planning periods in the future.

But then, I decide to cut some of those trees (only 10 per planning period), documented in a matrix of T/F values where T is harvested and F is not (trees can't be harvested twice).

age.vec <- sample(x = 1:150, size = 100, replace = T) # create our trees
age.mat <- cbind(age.vec, age.vec 5, age.vec   10, age.vec   15, age.vec   20) # grow them up
x.mat <- matrix(data = F, nrow = 100, ncol = 5) # create the empty harvest matrix
x.mat[cbind(sample(1:100, size = 50), rep(1:5, each = 10))] <-  T # 10 trees/year harvested

So then, the ages of trees that are harvested become zero in that year: age.mat[x.mat] <- 0

I then would like to age the harvested trees up again for the following periods. E.g. if a tree were harvested in the first planning period, in the second planning period (5 years later), I want the age of the tree to be 5, then in the third planning period (10 years later), I want the age of the tree to be 10. I have successfully implemented this in the following for loop:

for (i in 2:5){ # we don't need to calculate over the first year
  age.mat[,i]<-age.mat[,i-1] 5L # add 5 to previous year
  age.mat[x.mat[,i],i] <- 0L # reset age of harvested trees to zero
}

This works, however, it is clunky and slow. Is there a way to implement this faster (i.e. without the for loop)? This is something I'm iterating over thousands of times so speed is of the essence!

Thank you!

CodePudding user response:

You can use apply to work on each vector rowwise, then use some logic within the function to adjust the values you need and concatenate the values back together.

Should be about 2 times faster

  age.mat |> 
  apply(1, \(x) { 
    
    if(any(x == 0 & (which(x == 0) != length(x)))) { 
      
      need <- x[which(x == 0):length(x)] 
      
      need <- seq(0, 5 * (length(need) -1 ), 5)
      
      c(x[1:which(x == 0) -1], need )  
    
      } else x
  }) |> t())
     [,1] [,2] [,3] [,4] [,5]
  [1,]  101    0    5   10   15
  [2,]   55   60   65   70   75
  [3,]   23   28   33    0    5
  [4,]    0    5   10   15   20
  [5,]   23   28   33    0    5
  [6,]   84    0    5   10   15
  [7,]   52   57   62    0    5
  [8,]   26   31   36   41    0
  [9,]  114  119  124  129    0
 [10,]   33   38   43   48   53
 [11,]  144  149  154  159  164
 [12,]   19   24   29   34   39
 [13,]   43   48   53   58   63
 [14,]   69   74   79   84   89
 [15,]   98  103  108  113  118
 [16,]  110  115  120  125  130
 [17,]    8   13   18   23   28
 [18,]   16   21   26   31   36
 [19,]    1    6   11   16   21
 [20,]   60   65    0    5   10

CodePudding user response:

Here's an approach relying on the fact that harvesting a tree doesn't stop the passage of time, it just resets the clock. So we can think of a harvest as subtracting the harvest age from all future ages.

x.die <- x.mat * age.mat 
x.dif <- t(apply(x.die, 1, cumsum))
age.mat2 <- age.mat - x.dif

x.die, by multiplying the harvests by the ages, we get the age at each harvest. The next line calculates the cumulative sum of these across each row, and finally we subtract those from the original ages.

I assume your "trees can't be harvested twice" means we won't ever see two TRUEs in one row of x.mat? My code won't work right if there were more than one harvest per tree location.

  • Related