Change subsequent row values to the previous maximum value up to that point if the subsequent values are lower

For example, it looks like this now:

Sample	Col1	Col2	Col3	Col4	Col5
A	1	NA	2	1	3
B	1	2	NA	1	5
C	0	1	5	NA	3

I want it to look like this:

Sample	Col1	Col2	Col3	Col4	Col5
A	1	NA	2	2	3
B	1	2	NA	2	5
C	0	1	5	NA	5

CodePudding user response：

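df1 <- df
# temporarily replace NA with -Inf so it never wins the running maximum
df1[is.na(df1)] <- -Inf
# row-wise cumulative max over the numeric columns, then put the NAs back:
# NA^TRUE is NA and NA^FALSE is 1, so the multiplication acts as a mask
df1[-1] <- matrixStats::rowCummaxs(as.matrix(df1[-1])) * NA^is.na(df[-1])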
df1
  Sample Col1 Col2 Col3 Col4 Col5
1      A    1   NA    2    2    3
2      B    1    2   NA    2    5
3      C    0    1    5   NA    5
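
The `NA^is.na(...)` factor in the code above may look cryptic, so here is a small sketch of the same idea on a single row, using only base R. It assumes row A from the question as input; the variable names (`filled`, `running`, `mask`, `result`) are made up for illustration and are not part of the answer.

```r
# Row A from the question, as a plain numeric vector
x <- c(1, NA, 2, 1, 3)

# Step 1: replace NA with -Inf so it can never become the running maximum
filled <- x
filled[is.na(filled)] <- -Inf

# Step 2: running maximum along the row
running <- cummax(filled)

# Step 3: build the mask. In R, anything to the power 0 is 1,
# so NA^FALSE is 1 while NA^TRUE stays NA.
mask <- NA^is.na(x)

# Step 4: multiplying by the mask restores NA at the original positions
result <- running * mask
print(result)   # 1 NA 2 2 3
```

`matrixStats::rowCummaxs` does step 2 for every row of a matrix at once; the masking step is the same.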

or even:

df1 <- df
df1[is.na(df1)] <- -Inf
df1[-1] <- matrixStats::rowCummaxs(as.matrix(df1[-1]))
is.na(df1) <- is.na(df)
df1
  Sample Col1 Col2 Col3 Col4 Col5
1      A    1   NA    2    2    3
2      B    1    2   NA    2    5
3      C    0    1    5   NA    5

CodePudding user response：

We can use cummax from base R: loop over the rows of the numeric columns (df[-1]) with apply (MARGIN = 1), replace the non-NA elements of each row with their cumulative maximum, and assign the transposed result back.

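df[-1] <- t(apply(df[-1], 1, FUN = function(x) {
  i1 <- !is.na(x)           # positions holding non-NA values
  x[i1] <- cummax(x[i1])    # running maximum over those positions only
  x                         # NA positions are left untouched
}))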

-output

> df
  Sample Col1 Col2 Col3 Col4 Col5
1      A    1   NA    2    2    3
2      B    1    2   NA    2    5
3      C    0    1    5   NA    5
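
To see what happens inside the apply call, the per-row step can be tried on a single vector. This is only a sketch: `cummax_keep_na` is a hypothetical helper name introduced here, and the input is row C from the question.

```r
# Hypothetical helper, named here only for illustration
cummax_keep_na <- function(x) {
  keep <- !is.na(x)            # positions that hold real values
  x[keep] <- cummax(x[keep])   # running maximum over those positions only
  x                            # NA positions are returned unchanged
}

row_c <- c(0L, 1L, 5L, NA, 3L)  # row C from the question
print(cummax_keep_na(row_c))    # 0 1 5 NA 5
```

Because the NA is dropped before cummax is called, the last value 3 is compared against the earlier 5 and raised to 5, while the NA itself stays in place.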

data

df <- structure(list(Sample = c("A", "B", "C"), Col1 = c(1L, 1L, 0L
), Col2 = c(NA, 2L, 1L), Col3 = c(2L, NA, 5L), Col4 = c(1L, 1L, 
NA), Col5 = c(3L, 5L, 3L)), class = "data.frame", row.names = c(NA, 
-3L))