After grouping by species and taken max Sepal.Length (column 1) for each group I need to grab the value of column 2 to 4 that are associated to maximum value of column 1 (by group). I'm able to do so for each single column at once but not in an across
process. Any tips?
library(dplyr)
library(datasets)
data(iris)
Summarize by species with data associates to max sepal.length (by group), column by column:
iris_summary <- iris %>%
group_by(Species) %>%
summarise(
max_sep_length = max(Sepal.Length),
sep_w_associated_to = Sepal.Width[which.max(Sepal.Length)],
pet_l_associated_to = Petal.Length[which.max(Sepal.Length)],
pet_w_associated_to = Petal.Width[which.max(Sepal.Length)]
)
Now I would like obtain the same result using across
, but the outcome is different from that I expected (the df iris_summary has now same number of rows as iris, I can't understand why...)
iris_summary <- iris %>%
group_by(Species) %>%
summarise(
max_sepa_length = max(Sepal.Length),
across(
.cols = Sepal.Width : Petal.Width,
.funs = ~ .x[which.max(Sepal.Length)]
)
)
CodePudding user response:
Or use slice_max
library(dplyr) # devel can have `.by` or use `group_by(Species)`
iris %>%
slice_max(Sepal.Length, n = 1, by = 'Species')
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.8 4.0 1.2 0.2 setosa
2 7.0 3.2 4.7 1.4 versicolor
3 7.9 3.8 6.4 2.0 virginica
CodePudding user response:
in base R you could do:
merge(aggregate(Sepal.Length~Species, iris, max), iris)
Species Sepal.Length Sepal.Width Petal.Length Petal.Width
1 setosa 5.8 4.0 1.2 0.2
2 versicolor 7.0 3.2 4.7 1.4
3 virginica 7.9 3.8 6.4 2.0
CodePudding user response:
If we want to do the same with across, here is one option:
iris %>%
group_by(Species) %>%
summarise(across(everything(), ~ .[which.max(Sepal.Length)]))
Species Sepal.Length Sepal.Width Petal.Length Petal.Width
<fct> <dbl> <dbl> <dbl> <dbl>
1 setosa 5.8 4 1.2 0.2
2 versicolor 7 3.2 4.7 1.4
3 virginica 7.9 3.8 6.4 2