How to make a rank column in R

Time:03-07

I have a data frame with columns M1, M2 and M3, each holding the values obtained by one method. I want to add a rank column for each of them. For M1 and M2, the rank should run from the highest value to the lowest (highest value = rank 1); for M3 it should be reversed (lowest value = rank 1). The desired output is shown below.

df1<-structure(list(M1 = c(400,300, 200, 50), M2 = c(500,200, 10, 100), M3 = c(420,330, 230, 51)), class = "data.frame", row.names = c(NA,-4L))

> df1
   M1  M2  M3
1 400 500 420
2 300 200 330
3 200  10 230
4  50 100  51

Desired output

> df1
   M1 rank  M2 rank  M3 rank
1 400    1 500    1 420    4
2 300    2 200    2 330    3
3 200    3  10    4 230    2
4  50    4 100    3  51    1

Adjust rankings:

I used the code, but in the case I'm working on, my rankings looked like this: [image not available]

CodePudding user response:

Using rank and relocate:

library(dplyr)

df1 %>% 
  mutate(across(M1:M2, ~ rank(-.x), .names = "{.col}_rank"),
         M3_rank = rank(M3)) %>% 
  relocate(order(colnames(.)))

   M1 M1_rank  M2 M2_rank  M3 M3_rank
1 400       1 500       1 420       4
2 300       2 200       2 330       3
3 200       3  10       4 230       2
4  50       4 100       3  51       1

If your vector contains duplicate values, you have to choose a method for handling ties. By default, rank() gives tied values their average rank, but you can choose another method, such as ties.method = "first".

Another possibility, which I think is what you want, is to convert the ranks to a factor and then to numeric, so that you get only whole-number ranks with no gaps (tied values share one rank instead of the average).

df1 <- data.frame(M1 = c(400,300, 50, 300))
df1 %>% 
  mutate(M1_rankAverage = rank(-M1),
         M1_rankFirst = rank(-M1, ties.method = "first"),
         M1_unique = as.numeric(as.factor(rank(-M1))))

   M1 M1_rankAverage M1_rankFirst M1_unique
1 400            1.0            1         1
2 300            2.5            2         2
3  50            4.0            4         3
4 300            2.5            3         2
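
The tie handling above can also be sketched in base R without dplyr. This is only an illustration on the same example values: the names x, rank_min and rank_dense are made up here, and match() against the sorted unique values is one assumed way to get a gap-free ("dense") rank, not the method used in the answer.

```r
# Example vector with one tie (two 300s), same values as above
x <- c(400, 300, 50, 300)

# "min": tied values share the smallest rank, leaving a gap afterwards
rank_min <- rank(-x, ties.method = "min")

# Dense rank: position of each value among the sorted unique values,
# so tied values share a rank and no rank is skipped
rank_dense <- match(x, sort(unique(x), decreasing = TRUE))

print(data.frame(x, rank_min, rank_dense))
```

With ties.method = "min" the two 300s both get rank 2 and the next value gets rank 4, whereas the dense version continues with rank 3.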

CodePudding user response:

To achieve this result, I would convert each column to a factor and then convert the factor back to numeric. This could be done in base R, but for simplicity here is the tidyverse code. Because factor levels are sorted in ascending order, M1 and M2 need fct_rev() so that the highest value gets rank 1, while M3 keeps the ascending order:

library(tidyverse)

df1 <- df1 %>% 
  mutate(Rank1 = as.numeric(fct_rev(as.factor(M1)))) %>% 
  mutate(Rank2 = as.numeric(fct_rev(as.factor(M2)))) %>% 
  mutate(Rank3 = as.numeric(as.factor(M3)))
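
Since the answer mentions that base R would also work, here is a minimal sketch of the same factor idea without the tidyverse. It recreates the four-row df1 from the question so it runs on its own. The *_rank column names and the desc_cols helper vector are assumptions chosen for illustration, and negating the values before building the factor is one possible way to reverse the order.

```r
# Data from the question, recreated so the block is self-contained
df1 <- data.frame(M1 = c(400, 300, 200, 50),
                  M2 = c(500, 200, 10, 100),
                  M3 = c(420, 330, 230, 51))

# Columns where the highest value should get rank 1 (hypothetical helper)
desc_cols <- c("M1", "M2")

# Factor levels sort ascending, so negating the values reverses the ranking
df1[paste0(desc_cols, "_rank")] <- lapply(df1[desc_cols],
                                          function(x) as.integer(factor(-x)))

# M3: the lowest value gets rank 1, so no negation is needed
df1$M3_rank <- as.integer(factor(df1$M3))

# Put each rank column next to its source column
df1 <- df1[c("M1", "M1_rank", "M2", "M2_rank", "M3", "M3_rank")]
print(df1)
```

Because the ranks come from factor codes, tied values would share a rank and no rank would be skipped, as with dense_rank().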

CodePudding user response:

Another possible solution, based on dplyr:

library(dplyr)

df1 %>% 
  mutate(across(-3, ~ dense_rank(desc(.x)), .names = "{.col}_rank"),
         across(3, ~ dense_rank(.x), .names = "{.col}_rank")) %>% 
  relocate(sort(names(.)))

#>    M1 M1_rank  M2 M2_rank  M3 M3_rank
#> 1 400       1 500       1 420       4
#> 2 300       2 200       2 330       3
#> 3 200       3  10       4 230       2
#> 4  50       4 100       3  51       1