Sorting dataframe first by a column based on a list then by ascending numbers in another column

Time:10-09

set.seed(42)
df <- data.frame(letters=c(rep('data', 5), rep('oh', 5), rep('yeah', 5), rep('silly', 5)),
                 numbers=runif(n = 20, min = 1, max = 10))

I know I can sort alphabetically by the letters column and then numerically by the numbers column like this:

 df[with(df, order(letters, numbers)), ]

That's close, but I want the letters column to be sorted first in this specific order: c('silly', 'data', 'oh', 'yeah'), and then numbers ascending within each group.

How can I do this?

CodePudding user response:

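We can use match, which returns the position of each letters value within the desired order vector. order then sorts by those positions first and by numbers second: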

df[with(df, order(match(letters, c('silly', 'data', 'oh', 'yeah')), numbers)),]

-output

   letters  numbers
18   silly 2.057386
19   silly 5.274974
20   silly 6.042995
16   silly 9.460131
17   silly 9.804038
3     data 3.575256
5     data 6.775710
4     data 8.474029
1     data 9.233254
2     data 9.433679
8       oh 2.211999
6       oh 5.671864
9       oh 6.912931
10      oh 7.345583
7       oh 7.629295
14    yeah 3.298859
11    yeah 5.119676
15    yeah 5.160635
12    yeah 7.472010
13    yeah 9.412050
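
To see why this works, it may help to look at what match returns on its own. Below is a minimal sketch using a small made-up vector x and a helper variable ord (both are illustrative names, not part of the original data):

```r
# Desired order of the categories (taken from the question)
ord <- c('silly', 'data', 'oh', 'yeah')

# Hypothetical example vector, for illustration only
x <- c('oh', 'silly', 'yeah', 'data', 'silly')

# match() gives the position of each element of x within ord
key <- match(x, ord)
key
# [1] 3 1 4 2 1

# order() on those positions rearranges x into the custom order
x[order(key)]
# [1] "silly" "silly" "data"  "oh"    "yeah"
```

Values that do not appear in the order vector get NA from match, and order places NA last by default (na.last = TRUE), so such rows would end up at the bottom.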

Another option is to convert letters to a factor with the levels specified in the desired order:

df[with(df, order(factor(letters, levels = c('silly', 'data', 'oh', 'yeah')), numbers)),]
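
If the custom order is needed more than once, one possible variation is to store the column as a factor up front, so that a plain order(letters, numbers) call respects it. This is only a sketch: df2 and sorted are hypothetical names, and the copy is made so that the original df stays unchanged.

```r
# Rebuild the example data from the question
set.seed(42)
df <- data.frame(letters = c(rep('data', 5), rep('oh', 5), rep('yeah', 5), rep('silly', 5)),
                 numbers = runif(n = 20, min = 1, max = 10))

# df2 is a hypothetical copy, so the original df is left untouched
df2 <- df
df2$letters <- factor(df2$letters, levels = c('silly', 'data', 'oh', 'yeah'))

# order() sorts a factor by its level order, not alphabetically
sorted <- df2[order(df2$letters, df2$numbers), ]

# Groups now appear in the order given by the levels
unique(as.character(sorted$letters))
# [1] "silly" "data"  "oh"    "yeah"
```

A side effect of this approach is that the level order is carried along with the column, so other functions that respect factor levels will typically use the same order.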

CodePudding user response:

Here is a dplyr solution that also uses match, as suggested by akrun:

library(dplyr)
df %>% 
  arrange(match(letters, c('silly', 'data', 'oh', 'yeah')), numbers)

-output

   letters  numbers
1    silly 2.057386
2    silly 5.274974
3    silly 6.042995
4    silly 9.460131
5    silly 9.804038
6     data 3.575256
7     data 6.775710
8     data 8.474029
9     data 9.233254
10    data 9.433679
11      oh 2.211999
12      oh 5.671864
13      oh 6.912931
14      oh 7.345583
15      oh 7.629295
16    yeah 3.298859
17    yeah 5.119676
18    yeah 5.160635
19    yeah 7.472010
20    yeah 9.412050