I would like to order a table using different variable as following:
- increasing value taking into account the three variables at a same time and not increasing value for the first variable then for the second variable and finally for the last variable as we can do with the function arrange(col1, col2, col3).
Below is an example of what I mean. If not clear enough don't hesitate to ask more detail In the example line 5 (19,5,10) is before line 6 (6, NA, NA) because 5 is lower than 6
set.seed(123)
data=data.frame(col1=sample(1:20,10), col2=c(sample(1:20,5), NA, NA, NA, NA, NA), col3=c(sample(1:20,5), NA, NA, NA, NA, NA))
col1 col2 col3
1 15 14 7
2 19 5 10
3 14 9 9
4 3 3 4
5 10 8 14
6 2 NA NA
7 6 NA NA
8 11 NA NA
9 5 NA NA
10 4 NA NA
> data_output
col1 col2 col3
1 2 NA NA
2 3 3 4
3 4 NA NA
4 5 NA NA
5 19 5 10
6 6 NA NA
7 15 14 7
8 10 8 14
9 14 9 9
10 11 NA NA
Any idea ?
thank you
CodePudding user response:
Take the min
imum value per row, and order
it:
data[order(apply(data, 1, min, na.rm = T)), ]
or, with pmin
:
data[order(do.call(pmin, c(data, na.rm = TRUE))), ]
or with matrixStats::rowMins
:
library(matrixStats)
data[order(rowMins(as.matrix(data), na.rm = T)), ]
col1 col2 col3
6 2 NA NA
4 3 3 4
10 4 NA NA
2 19 5 10
9 5 NA NA
7 6 NA NA
1 15 14 7
5 10 8 14
3 14 9 9
8 11 NA NA
CodePudding user response:
The core idea would be to arrange by the minimum value per row. Here using dplyr
:
library(dplyr)
data |>
rowwise() |>
mutate(min_sort = min(c_across(everything()), na.rm = TRUE)) |>
ungroup() |>
arrange(min_sort) # |>
#select(-min_sort)
Output:
# A tibble: 10 × 4
col1 col2 col3 min_sort
<int> <int> <int> <int>
1 2 NA NA 2
2 3 3 4 3
3 4 NA NA 4
4 19 5 10 5
5 5 NA NA 5
6 6 NA NA 6
7 15 14 7 7
8 10 8 14 8
9 14 9 9 9
10 11 NA NA 11
Update: @Maël beat me with 18 sec, but now you have the same core idea expressed in different ways.