For example, it currently looks like this:
Sample | Col1 | Col2 | Col3 | Col4 | Col5 |
---|---|---|---|---|---|
A | 1 | NA | 2 | 1 | 3 |
B | 1 | 2 | NA | 1 | 5 |
C | 0 | 1 | 5 | NA | 3 |
I want it to look like this:
Sample | Col1 | Col2 | Col3 | Col4 | Col5 |
---|---|---|---|---|---|
A | 1 | NA | 2 | 2 | 3 |
B | 1 | 2 | NA | 2 | 5 |
C | 0 | 1 | 5 | NA | 5 |
CodePudding user response:
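df1 <- df
# Replace NA with -Inf so missing values never raise the running maximum
df1[is.na(df1)] <- -Inf
# Row-wise cumulative max; NA^TRUE is NA and NA^FALSE is 1, which restores the original NAs
df1[-1] <- matrixStats::rowCummaxs(as.matrix(df1[-1])) * NA^is.na(df[-1])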
df1
Sample Col1 Col2 Col3 Col4 Col5
1 A 1 NA 2 2 3
2 B 1 2 NA 2 5
3 C 0 1 5 NA 5
or even:
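df1 <- df
# Replace NA with -Inf so missing values never raise the running maximum
df1[is.na(df1)] <- -Inf
df1[-1] <- matrixStats::rowCummaxs(as.matrix(df1[-1]))
# Put NA back wherever the original data had NA
is.na(df1) <- is.na(df)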
df1
Sample Col1 Col2 Col3 Col4 Col5
1 A 1 NA 2 2 3
2 B 1 2 NA 2 5
3 C 0 1 5 NA 5
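To see why the `NA^is.na(...)` multiplier in the first variant restores the missing values, here is a minimal sketch on a single vector (row A of the example data). It uses base R `cummax` in place of `matrixStats::rowCummaxs`, and the variable names are only illustrative.

```r
# Row A from the question's data (illustrative input)
x <- c(1, NA, 2, 1, 3)

# Step 1: replace NA with -Inf so it can never raise the running maximum
filled <- replace(x, is.na(x), -Inf)

# Step 2: running maximum over the filled values
running <- cummax(filled)

# Step 3: NA^TRUE is NA and NA^FALSE is 1, so multiplying puts the NAs back
mask <- NA^is.na(x)

result <- running * mask
print(result)
```

The mask is needed because without it the formerly missing positions would carry the running maximum (or -Inf for a leading NA) instead of staying NA.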
CodePudding user response:
We may use cummax from base R: loop over the numeric columns of the dataset (df[-1]) by row with apply (MARGIN = 1), replace the non-NA elements with the cumulative max of those values, and assign the result back.
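# apply() returns one column per row, so transpose the result back with t()
df[-1] <- t(apply(df[-1], 1, FUN = function(x) {
  i1 <- !is.na(x)           # positions of the non-NA values
  x[i1] <- cummax(x[i1])    # cumulative max over the non-NA values only
  x
}))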
-output
> df
Sample Col1 Col2 Col3 Col4 Col5
1 A 1 NA 2 2 3
2 B 1 2 NA 2 5
3 C 0 1 5 NA 5
data
df <- structure(list(Sample = c("A", "B", "C"), Col1 = c(1L, 1L, 0L
), Col2 = c(NA, 2L, 1L), Col3 = c(2L, NA, 5L), Col4 = c(1L, 1L,
NA), Col5 = c(3L, 5L, 3L)), class = "data.frame", row.names = c(NA,
-3L))
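For reuse, the base R approach can be wrapped in a small helper. This is only a sketch: the name `cummax_keep_na` is hypothetical, and the data frame is rebuilt with `data.frame()` so the block runs on its own.

```r
# Hypothetical helper (name is illustrative): running max that skips NAs
cummax_keep_na <- function(x) {
  ok <- !is.na(x)
  x[ok] <- cummax(x[ok])
  x
}

# Same data as in the question
df <- data.frame(
  Sample = c("A", "B", "C"),
  Col1 = c(1L, 1L, 0L),
  Col2 = c(NA, 2L, 1L),
  Col3 = c(2L, NA, 5L),
  Col4 = c(1L, 1L, NA),
  Col5 = c(3L, 5L, 3L)
)

# Apply row-wise to the numeric columns and transpose back
res <- df
res[-1] <- t(apply(df[-1], 1, cummax_keep_na))
print(res)
```

A leading NA stays NA with this helper, since only the non-NA positions are ever overwritten.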