dplyr: conditionally rank a column based on a condition of another?

Time:04-23

Hi, suppose I have a table like this. I want to rank by "percent", but only for the rows where the cat column is "HIGH"; the "LOW" rows should be ignored.

       name  cat Freq percent
1     berry HIGH  259   0.583
2      jack HIGH   45   0.634
3     steve HIGH  331   0.943
4     nadia HIGH  304   0.580
5     jacob HIGH  179   0.844
6     susan HIGH   15   0.833
7  luthered HIGH   14   0.264
8      jane HIGH   99   0.513
9     berry  LOW  185   0.417
10     jack  LOW   26   0.366
11    steve  LOW   20   0.057
12    nadia  LOW  220   0.420
13    jacob  LOW   33   0.156
14    susan  LOW    3   0.167
15 luthered  LOW   39   0.736
16     jane  LOW   94   0.487

Here is the data as dput() output:

temp = structure(list(name = c("berry", "jack", "steve", "nadia", "jacob", 
"susan", "luthered", "jane", "berry", "jack", "steve", "nadia", 
"jacob", "susan", "luthered", "jane"), cat = c("HIGH", "HIGH", 
"HIGH", "HIGH", "HIGH", "HIGH", "HIGH", "HIGH", "LOW", "LOW", 
"LOW", "LOW", "LOW", "LOW", "LOW", "LOW"), Freq = c(259L, 45L, 
331L, 304L, 179L, 15L, 14L, 99L, 185L, 26L, 20L, 220L, 33L, 3L, 
39L, 94L), percent = c(0.583, 0.634, 0.943, 0.58, 0.844, 0.833, 
0.264, 0.513, 0.417, 0.366, 0.057, 0.42, 0.156, 0.167, 0.736, 
0.487)), class = "data.frame", row.names = c(NA, -16L))

I tried doing this but the order is not correct.

temp %>% arrange(desc(percent), cat == "HIGH")

If the ordering were correct, the names would appear in this order: steve jacob susan jack berry nadia jane luthered

Thanks in advance.

CodePudding user response:

We may use

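library(dplyr)
# Sort key: dense rank of percent (descending) for HIGH rows, n() + 1 for all other rows
temp %>% 
   arrange(replace(rep(n() + 1, n()), cat == "HIGH", 
      dense_rank(-percent[cat == "HIGH"])))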
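To see what the sort key built inside arrange() looks like, it can help to materialise it as a column first. The following is a minimal sketch, not part of the original answer: `sort_key`, `keyed`, and `ordered` are hypothetical names used only for inspection, and `temp` is rebuilt compactly so the block runs on its own.

```r
library(dplyr)

# Same values as `temp` in the question, rebuilt compactly.
temp <- data.frame(
  name = rep(c("berry", "jack", "steve", "nadia", "jacob",
               "susan", "luthered", "jane"), 2),
  cat = rep(c("HIGH", "LOW"), each = 8),
  Freq = c(259L, 45L, 331L, 304L, 179L, 15L, 14L, 99L,
           185L, 26L, 20L, 220L, 33L, 3L, 39L, 94L),
  percent = c(0.583, 0.634, 0.943, 0.58, 0.844, 0.833, 0.264, 0.513,
              0.417, 0.366, 0.057, 0.42, 0.156, 0.167, 0.736, 0.487)
)

# `sort_key` is a hypothetical helper column, added only to inspect the key.
keyed <- temp %>%
  mutate(sort_key = replace(rep(n() + 1, n()), cat == "HIGH",
                            dense_rank(-percent[cat == "HIGH"])))

# HIGH rows carry their rank; every LOW row carries the filler value n() + 1.
print(select(keyed, name, cat, percent, sort_key))

# Sorting on the helper column gives the same row order as the arrange() call.
ordered <- arrange(keyed, sort_key)
print(ordered$name[ordered$cat == "HIGH"])
```

Because all LOW rows share the same filler value, they tie with each other and sort after every HIGH row.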

Or may also use

temp %>%
  group_by(cat) %>% 
  # .y holds the group key; only the HIGH group is reordered
  group_modify(~ .x %>%
    arrange(if (.y$cat == "HIGH") desc(percent) else n() + 1)) %>%
  ungroup()

-output (from the group_modify version, which returns a tibble with 'cat' first)

# A tibble: 16 × 4
   cat   name      Freq percent
   <chr> <chr>    <int>   <dbl>
 1 HIGH  steve      331   0.943
 2 HIGH  jacob      179   0.844
 3 HIGH  susan       15   0.833
 4 HIGH  jack        45   0.634
 5 HIGH  berry      259   0.583
 6 HIGH  nadia      304   0.58 
 7 HIGH  jane        99   0.513
 8 HIGH  luthered    14   0.264
 9 LOW   berry      185   0.417
10 LOW   jack        26   0.366
11 LOW   steve       20   0.057
12 LOW   nadia      220   0.42 
13 LOW   jacob       33   0.156
14 LOW   susan        3   0.167
15 LOW   luthered    39   0.736
16 LOW   jane        94   0.487
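If the conditional sort key feels opaque, a more explicit alternative (not taken from the answer above) is to split the rows, sort only the HIGH part, and stack the pieces back together. This is a sketch that assumes the HIGH rows should come first in the result, as they already do in the sample data; `high_sorted`, `low_as_is`, and `result` are illustrative names.

```r
library(dplyr)

# Same values as `temp` in the question, rebuilt compactly.
temp <- data.frame(
  name = rep(c("berry", "jack", "steve", "nadia", "jacob",
               "susan", "luthered", "jane"), 2),
  cat = rep(c("HIGH", "LOW"), each = 8),
  Freq = c(259L, 45L, 331L, 304L, 179L, 15L, 14L, 99L,
           185L, 26L, 20L, 220L, 33L, 3L, 39L, 94L),
  percent = c(0.583, 0.634, 0.943, 0.58, 0.844, 0.833, 0.264, 0.513,
              0.417, 0.366, 0.057, 0.42, 0.156, 0.167, 0.736, 0.487)
)

# Sort only the HIGH rows by percent, largest first.
high_sorted <- temp %>%
  filter(cat == "HIGH") %>%
  arrange(desc(percent))

# Keep every other row in its original order.
low_as_is <- temp %>%
  filter(cat != "HIGH")

# Stack the two pieces: sorted HIGH rows first, untouched rows after.
result <- bind_rows(high_sorted, low_as_is)
print(result)
```

The trade-off is that this always moves the HIGH rows to the top, so it only matches the approaches above when that is the layout you want.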

Or, if the names should be ordered by the 'percent' values of their 'HIGH' rows, keeping each name's 'HIGH' and 'LOW' rows together:

temp %>% 
  arrange(factor(name, levels = unique(
    name[cat == "HIGH"][order(dense_rank(-percent[cat == "HIGH"]))])))

-output

       name  cat Freq percent
1     steve HIGH  331   0.943
2     steve  LOW   20   0.057
3     jacob HIGH  179   0.844
4     jacob  LOW   33   0.156
5     susan HIGH   15   0.833
6     susan  LOW    3   0.167
7      jack HIGH   45   0.634
8      jack  LOW   26   0.366
9     berry HIGH  259   0.583
10    berry  LOW  185   0.417
11    nadia HIGH  304   0.580
12    nadia  LOW  220   0.420
13     jane HIGH   99   0.513
14     jane  LOW   94   0.487
15 luthered HIGH   14   0.264
16 luthered  LOW   39   0.736
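The paired ordering above can also be sketched with a per-name helper column instead of a factor. This is an alternative, not the answer's method, and it assumes every name has exactly one HIGH row (true for the sample data); `high_pct` and `paired` are hypothetical names.

```r
library(dplyr)

# Same values as `temp` in the question, rebuilt compactly.
temp <- data.frame(
  name = rep(c("berry", "jack", "steve", "nadia", "jacob",
               "susan", "luthered", "jane"), 2),
  cat = rep(c("HIGH", "LOW"), each = 8),
  Freq = c(259L, 45L, 331L, 304L, 179L, 15L, 14L, 99L,
           185L, 26L, 20L, 220L, 33L, 3L, 39L, 94L),
  percent = c(0.583, 0.634, 0.943, 0.58, 0.844, 0.833, 0.264, 0.513,
              0.417, 0.366, 0.057, 0.42, 0.156, 0.167, 0.736, 0.487)
)

paired <- temp %>%
  group_by(name) %>%
  # Copy each name's HIGH percent onto all of that name's rows.
  # Assumption: exactly one HIGH row per name.
  mutate(high_pct = percent[cat == "HIGH"]) %>%
  ungroup() %>%
  # Sort names by their HIGH percent, largest first, then drop the helper.
  arrange(desc(high_pct)) %>%
  select(-high_pct)

print(paired)
```

If a name could have zero or several HIGH rows, the `mutate()` step would need a summary such as a maximum instead of a plain subset.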