Home > Software engineering >  How do I rank across groups in R?
How do I rank across groups in R?

Time:06-14

Here's a dataframe I have

       Sales Rep order_dates  Sales in Dollars
1          John  2010-11-01           25
2          John  2008-03-25           30
3          Alex  2010-11-01            5
4          Alex  2008-03-25           15
5          Jeff  2010-11-01           31
5          Jeff  2008-03-25           2

I'd like to rank across the groups here based on date. For instance, I'd like to rank Jeff as being #1 for 2010-11-01, John as being #2 and Alex as being number 3. Similarly, I'd like John to be #1 for 2008-03-25, Alex to be #2, and Jeff to be number 3. How would I achieve this?

CodePudding user response:

We may use dense_rank after grouping by 'order_dates'

library(dplyr)
df1 <- df1 %>%   
   group_by(order_dates = as.Date(order_dates)) %>% 
   mutate(Rank = dense_rank(-Sales_in_Dollars)) %>% 
   ungroup

-output

df1
# A tibble: 6 × 4
  Sales_Rep order_dates Sales_in_Dollars  Rank
  <chr>     <date>                 <int> <int>
1 John      2010-11-01                25     2
2 John      2008-03-25                30     1
3 Alex      2010-11-01                 5     3
4 Alex      2008-03-25                15     2
5 Jeff      2010-11-01                31     1
6 Jeff      2008-03-25                 2     3

data

df1 <- structure(list(Sales_Rep = c("John", "John", "Alex", "Alex", 
"Jeff", "Jeff"), order_dates = c("2010-11-01", "2008-03-25", 
"2010-11-01", "2008-03-25", "2010-11-01", "2008-03-25"), 
Sales_in_Dollars = c(25L, 
30L, 5L, 15L, 31L, 2L)), class = "data.frame", 
row.names = c("1", 
"2", "3", "4", "5", "6"))

CodePudding user response:

You can use rank:

 library(tidyverse)
 df %>%
     group_by(order_dates) %>%
     mutate(Rank = rank(-Sales_in_Dollars))

# A tibble: 6 x 4
# Groups:   order_dates [2]
  Sales_Rep order_dates Sales_in_Dollars  Rank
  <chr>     <chr>                  <int> <dbl>
1 John      2010-11-01                25     2
2 John      2008-03-25                30     1
3 Alex      2010-11-01                 5     3
4 Alex      2008-03-25                15     2
5 Jeff      2010-11-01                31     1
6 Jeff      2008-03-25                 2     3

CodePudding user response:

You can do this:

library(data.table)
setDT(df)[order(-Sales_in_Dollars), rank:=1:.N, order_dates]

Output:

   Sales_Rep order_dates Sales_in_Dollars  rank
      <char>      <IDat>            <int> <int>
1:      John  2010-11-01               25     2
2:      John  2008-03-25               30     1
3:      Alex  2010-11-01                5     3
4:      Alex  2008-03-25               15     2
5:      Jeff  2010-11-01               31     1
6:      Jeff  2008-03-25                2     3

CodePudding user response:

A native solution with ave() from stats:

within(df, {
  Rank <- ave(-Sales_in_Dollars, order_dates,
              FUN = \(x) rank(x, ties.method = "min"))
})

#   Sales_Rep order_dates Sales_in_Dollars Rank
# 1      John  2010-11-01               25    2
# 2      John  2008-03-25               30    1
# 3      Alex  2010-11-01                5    3
# 4      Alex  2008-03-25               15    2
# 5      Jeff  2010-11-01               31    1
# 6      Jeff  2008-03-25                2    3

CodePudding user response:

Update: Many thanks to input from @akrun: We can compact it like:

library(dplyr)

df %>% 
  group_by(order_dates) %>% 
  mutate(rank = row_number(-Sales_in_Dollars)) %>% 
  ungroup()

First answer: Here is one more. It is not the best one but it is interesting what we can do with row_number() group_by arrange:

library(dplyr)

df %>% 
  mutate(id_1 = row_number()) %>% 
  group_by(order_dates) %>% 
  arrange(desc(Sales_n_Dollars), .by_group = "TRUE") %>% 
  mutate(Rank = row_number()) %>% 
  arrange(id_1) %>% 
  select(-id_1)
 Sales_Rep order_dates Sales_n_Dollars  Rank
  <chr>     <chr>                 <int> <int>
1 John      2010-11-01               25     2
2 John      2008-03-25               30     1
3 Alex      2010-11-01                5     3
4 Alex      2008-03-25               15     2
5 Jeff      2010-11-01               31     1
6 Jeff      2008-03-25                2     3
  • Related