Here's a dataframe I have
Sales Rep order_dates Sales in Dollars
1 John 2010-11-01 25
2 John 2008-03-25 30
3 Alex 2010-11-01 5
4 Alex 2008-03-25 15
5 Jeff 2010-11-01 31
5 Jeff 2008-03-25 2
I'd like to rank across the groups here based on date. For instance, I'd like to rank Jeff as being #1 for 2010-11-01, John as being #2 and Alex as being number 3. Similarly, I'd like John to be #1 for 2008-03-25, Alex to be #2, and Jeff to be number 3. How would I achieve this?
CodePudding user response:
We may use dense_rank
after grouping by 'order_dates'
library(dplyr)
df1 <- df1 %>%
group_by(order_dates = as.Date(order_dates)) %>%
mutate(Rank = dense_rank(-Sales_in_Dollars)) %>%
ungroup
-output
df1
# A tibble: 6 × 4
Sales_Rep order_dates Sales_in_Dollars Rank
<chr> <date> <int> <int>
1 John 2010-11-01 25 2
2 John 2008-03-25 30 1
3 Alex 2010-11-01 5 3
4 Alex 2008-03-25 15 2
5 Jeff 2010-11-01 31 1
6 Jeff 2008-03-25 2 3
data
df1 <- structure(list(Sales_Rep = c("John", "John", "Alex", "Alex",
"Jeff", "Jeff"), order_dates = c("2010-11-01", "2008-03-25",
"2010-11-01", "2008-03-25", "2010-11-01", "2008-03-25"),
Sales_in_Dollars = c(25L,
30L, 5L, 15L, 31L, 2L)), class = "data.frame",
row.names = c("1",
"2", "3", "4", "5", "6"))
CodePudding user response:
You can use rank
:
library(tidyverse)
df %>%
group_by(order_dates) %>%
mutate(Rank = rank(-Sales_in_Dollars))
# A tibble: 6 x 4
# Groups: order_dates [2]
Sales_Rep order_dates Sales_in_Dollars Rank
<chr> <chr> <int> <dbl>
1 John 2010-11-01 25 2
2 John 2008-03-25 30 1
3 Alex 2010-11-01 5 3
4 Alex 2008-03-25 15 2
5 Jeff 2010-11-01 31 1
6 Jeff 2008-03-25 2 3
CodePudding user response:
You can do this:
library(data.table)
setDT(df)[order(-Sales_in_Dollars), rank:=1:.N, order_dates]
Output:
Sales_Rep order_dates Sales_in_Dollars rank
<char> <IDat> <int> <int>
1: John 2010-11-01 25 2
2: John 2008-03-25 30 1
3: Alex 2010-11-01 5 3
4: Alex 2008-03-25 15 2
5: Jeff 2010-11-01 31 1
6: Jeff 2008-03-25 2 3
CodePudding user response:
A native solution with ave()
from stats
:
within(df, {
Rank <- ave(-Sales_in_Dollars, order_dates,
FUN = \(x) rank(x, ties.method = "min"))
})
# Sales_Rep order_dates Sales_in_Dollars Rank
# 1 John 2010-11-01 25 2
# 2 John 2008-03-25 30 1
# 3 Alex 2010-11-01 5 3
# 4 Alex 2008-03-25 15 2
# 5 Jeff 2010-11-01 31 1
# 6 Jeff 2008-03-25 2 3
CodePudding user response:
Update: Many thanks to input from @akrun: We can compact it like:
library(dplyr)
df %>%
group_by(order_dates) %>%
mutate(rank = row_number(-Sales_in_Dollars)) %>%
ungroup()
First answer:
Here is one more. It is not the best one but it is interesting what we can do with row_number()
group_by
arrange
:
library(dplyr)
df %>%
mutate(id_1 = row_number()) %>%
group_by(order_dates) %>%
arrange(desc(Sales_n_Dollars), .by_group = "TRUE") %>%
mutate(Rank = row_number()) %>%
arrange(id_1) %>%
select(-id_1)
Sales_Rep order_dates Sales_n_Dollars Rank
<chr> <chr> <int> <int>
1 John 2010-11-01 25 2
2 John 2008-03-25 30 1
3 Alex 2010-11-01 5 3
4 Alex 2008-03-25 15 2
5 Jeff 2010-11-01 31 1
6 Jeff 2008-03-25 2 3