Get column value associated to another column maximum in dplyr's across

Time:01-20

After grouping by Species and taking the maximum of Sepal.Length (column 1) in each group, I need the values of columns 2 to 4 from the row where column 1 reaches that maximum (by group). I can do this one column at a time, but not with across. Any tips?

library(dplyr)
library(datasets)
data(iris)

Summarise by species, taking the values associated with the maximum Sepal.Length in each group, column by column:

iris_summary <- iris %>%
  group_by(Species) %>%
  summarise(
    max_sep_length = max(Sepal.Length),
    sep_w_associated_to = Sepal.Width[which.max(Sepal.Length)],
    pet_l_associated_to = Petal.Length[which.max(Sepal.Length)],
    pet_w_associated_to = Petal.Width[which.max(Sepal.Length)]
  )

Now I would like to obtain the same result using across, but the outcome is different from what I expected (iris_summary now has the same number of rows as iris, and I can't understand why):

iris_summary <- iris %>%
  group_by(Species) %>%
  summarise(
    max_sepa_length = max(Sepal.Length),
    across(
      .cols = Sepal.Width : Petal.Width,
      .funs = ~ .x[which.max(Sepal.Length)]
    )
  )

CodePudding user response:

Alternatively, use slice_max, which keeps the whole row holding the maximum in each group:

library(dplyr) # `by` requires dplyr >= 1.1.0; with older versions use `group_by(Species)` first
iris %>% 
   slice_max(Sepal.Length, n = 1, by = 'Species')
  Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1          5.8         4.0          1.2         0.2     setosa
2          7.0         3.2          4.7         1.4 versicolor
3          7.9         3.8          6.4         2.0  virginica
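
For dplyr versions without the `by` argument, the same idea can be sketched with group_by. Note that slice_max keeps tied rows by default; iris happens to have a single maximum per species, but on other data `with_ties = FALSE` guarantees one row per group. The name `top_rows` is only illustrative.

```r
library(dplyr)

# One row per species: the row where Sepal.Length is largest.
# with_ties = FALSE keeps only the first row if several share the maximum.
top_rows <- iris %>%
  group_by(Species) %>%
  slice_max(Sepal.Length, n = 1, with_ties = FALSE) %>%
  ungroup()

top_rows
```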

CodePudding user response:

In base R you could do:

merge(aggregate(Sepal.Length~Species, iris, max), iris)

     Species Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa          5.8         4.0          1.2         0.2
2 versicolor          7.0         3.2          4.7         1.4
3  virginica          7.9         3.8          6.4         2.0

CodePudding user response:

If we want to do the same with across, here is one option:

iris %>% 
  group_by(Species) %>% 
  summarise(across(everything(), ~ .[which.max(Sepal.Length)]))
  Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
  <fct>             <dbl>       <dbl>        <dbl>       <dbl>
1 setosa              5.8         4            1.2         0.2
2 versicolor          7           3.2          4.7         1.4
3 virginica           7.9         3.8          6.4         2  
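
A likely reason the across attempt in the question returned as many rows as iris: the function argument of across is named `.fns`, not `.funs`. With `.funs`, the formula is not matched to `.fns`, so across hands back the selected columns unchanged and summarise keeps every row of each group. The sketch below keeps the question's column range and only corrects the argument name; the `_at_max` suffix passed to `.names` is an illustrative choice, not something from the question.

```r
library(dplyr)

iris_summary <- iris %>%
  group_by(Species) %>%
  summarise(
    # New name, so the original Sepal.Length column stays visible below
    max_sep_length = max(Sepal.Length),
    across(
      .cols = Sepal.Width:Petal.Width,
      .fns = ~ .x[which.max(Sepal.Length)],  # `.fns`, not `.funs`
      .names = "{.col}_at_max"               # hypothetical naming scheme
    )
  )

iris_summary
```

Giving the maximum a new name matters here: if it were stored as Sepal.Length, later expressions in the same summarise call would see the one-value summary rather than the original column.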