Given a data.frame like the one below, how do I find the kth largest value for each row from a range of columns:
Input:
ID | i1 | i2 | i3 |
---|---|---|---|
1 | NA | NA | NA |
1 | 0 | NA | NA |
1 | 1 | 0 | NA |
1 | 1 | 1 | NA |
1 | 0 | 1 | NA |
1 | 0 | 1 | 2 |
Desired Output:
ID | i1 | i2 | i3 | k=1 | k=2 | k=3 |
---|---|---|---|---|---|---|
1 | NA | NA | NA | NA | NA | NA |
1 | 0 | NA | NA | 0 | NA | NA |
1 | 1 | 0 | NA | 0 | 1 | NA |
1 | 1 | 1 | NA | 1 | 1 | NA |
1 | 0 | 1 | NA | 0 | 1 | NA |
1 | 2 | 1 | 0 | 0 | 1 | 2 |
I would like a solution that works with three columns like above (i1,i2,i3) but can also work for a dataframe with say 6 columns named (i1,i2,i3,i4,i5,i6) and more potentially. Assume these columns are always adjacent to each other and named like how I have them
CodePudding user response:
If it is to sort by row and create new columns
m1 <- t(apply(df1[-1], 1, function(x) sort(x, na.last = TRUE)))
df1[paste0("k=", seq_len(ncol(m1)))] <- m1
-output
> df1
ID i1 i2 i3 k=1 k=2 k=3
1 1 NA NA NA NA NA NA
2 1 0 NA NA 0 NA NA
3 1 1 0 NA 0 1 NA
4 1 1 1 NA 1 1 NA
5 1 0 1 NA 0 1 NA
6 1 0 1 2 0 1 2
data
df1 <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L), i1 = c(NA, 0L,
1L, 1L, 0L, 0L), i2 = c(NA, NA, 0L, 1L, 1L, 1L), i3 = c(NA, NA,
NA, NA, NA, 2L)), class = "data.frame", row.names = c(NA, -6L
))