Home > Mobile >  R Within-Group Sorting
R Within-Group Sorting

Time:03-06

I have a dataframe looks like below:

 person    year    Office        Job      rank    
 Harry     2002    Los Angeles   CEO       -1     
 Harry     2006    Boston        CEO       -1     
 Harry     2006    Los Angeles   Advisor   0      
 Harry     2006    Chicago       Chairman  2      
    
 Peter     2001    New York      Director  -1     
 Peter     2001    Chicago       CFO       3      
 Peter     2001    Chicago       CEO       1      

I want to sort the data within the person-year groups. The sorting should be based on the information in the column "rank". For each person-year group with multiple (>1) records, I want to sort the data based on the numeric values in the column "rank" (from small to large) and create a new column called "rank2" to capture the sorted order after sorting (starting from 0).

Thus, the output should look like:

 person    year    Office        Job      rank    rank2
 Harry     2002    Los Angeles   CEO       -1     0
 Harry     2006    Boston        CEO       -1     0
 Harry     2006    Los Angeles   Advisor   0      1
 Harry     2006    Chicago       Chairman  2      2
    
 Peter     2001    New York      Director  -1     0
 Peter     2001    Chicago       CFO       3      2
 Peter     2001    Chicago       CEO       1      1

CodePudding user response:

df %>% 
  arrange(person,year,rank) %>% 
  group_by(person,year) %>% 
  mutate(rank2 = row_number()-1)

data.table option

df[order(person,year,rank),rank2:=0:(.N-1), by=.(person,year)]

CodePudding user response:

Using dense_rank -

library(dplyr)

df %>%
  group_by(person, year) %>%
  mutate(rank2 = dense_rank(rank) - 1) %>%
  ungroup

#  person  year Office     Job       rank rank2
#  <chr>  <int> <chr>      <chr>    <int> <dbl>
#1 Harry   2002 LosAngeles CEO         -1     0
#2 Harry   2006 Boston     CEO         -1     0
#3 Harry   2006 LosAngeles Advisor      0     1
#4 Harry   2006 Chicago    Chairman     2     2
#5 Peter   2001 NewYork    Director    -1     0
#6 Peter   2001 Chicago    CFO          3     2
#7 Peter   2001 Chicago    CEO          1     1

Similarly, in base R we can use rank -

transform(df, rank2 = ave(rank, person, year, FUN = base::rank) - 1)
  • Related