I want to do something like this:
df1 <- iris %>%
  distinct(Species, .keep_all = TRUE) %>%
  group_by(Petal.Width) %>%
  summarise(Sepal.Length.mean1 = mean(Sepal.Length), .groups = "drop")
df2 <- iris %>%
  distinct(Species, Petal.Width, .keep_all = TRUE) %>%
  group_by(Petal.Width) %>%
  summarise(Sepal.Length.mean2 = mean(Sepal.Length), .groups = "drop")
inner_join(df1, df2, by = "Petal.Width")
But this is tedious to read because of the repetition. Is it possible to do it all in one pipe? After distinct() I can no longer get back to the original dataset, so I wonder whether there is an alternative.
CodePudding user response:
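One possible solution is to wrap the repeated steps in a function and call it once per column. Note that, as written, f() applies the same distinct(Species) in both calls, so the two mean columns in the output are identical; to reproduce df2 from the question, Petal.Width would also have to be passed to distinct():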
library(tidyverse)

# Deduplicate on var1, group by var2, and average var3.
# `i` is the suffix appended to the name of the summary column.
f <- function(df = iris, var1 = Species, var2 = Petal.Width,
              var3 = Sepal.Length, i) {
  x <- enquo(var3)
  df %>%
    distinct({{ var1 }}, .keep_all = TRUE) %>%
    group_by({{ var2 }}) %>%
    summarise(!!str_c(quo_name(x), ".mean", i, sep = "") := mean({{ var3 }}),
              .groups = "drop")
}

inner_join(f(i = 1), f(i = 2), by = "Petal.Width")
#> # A tibble: 3 × 3
#>   Petal.Width Sepal.Length.mean1 Sepal.Length.mean2
#>         <dbl>              <dbl>              <dbl>
#> 1         0.2                5.1                5.1
#> 2         1.4                7                  7
#> 3         2.5                6.3                6.3
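To have the second column come from distinct(Species, Petal.Width), as df2 does in the question, the columns used for deduplication can be made an argument. The following is only a sketch under that assumption: the names mean_after_distinct and suffix are made up for illustration, and the grouping and summary columns are hard-coded to keep it short.

```r
library(dplyr)

# Hypothetical helper (not from the original answer):
#   ...    : columns forwarded to distinct()
#   suffix : appended to the name of the summary column
mean_after_distinct <- function(df, ..., suffix) {
  df %>%
    distinct(..., .keep_all = TRUE) %>%
    group_by(Petal.Width) %>%
    # Glue syntax on the left of := builds the column name from `suffix`.
    summarise("Sepal.Length.mean{suffix}" := mean(Sepal.Length),
              .groups = "drop")
}

result <- inner_join(
  mean_after_distinct(iris, Species, suffix = 1),
  mean_after_distinct(iris, Species, Petal.Width, suffix = 2),
  by = "Petal.Width"
)
result
```

The first call mirrors df1 and the second mirrors df2, so the join should match the two-step version from the question.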
CodePudding user response:
A workaround is to pipe into a braced expression ({ }), inside which . refers to the piped data frame and can be used more than once.
Here is the start of a solution:
iris %>%
  {
    df1 <- distinct(., Species, .keep_all = TRUE)
    df2 <- distinct(., Species, Petal.Width, .keep_all = TRUE)
    list(df1, df2)
  } %>%
  map(~ group_by(.x, Petal.Width)) # SOLUTION TO BE COMPLETED
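One way the pipeline above might be completed is to summarise each list element and then join the pieces with purrr::reduce(). This is a sketch rather than the answerer's intended ending; summarise_one is a hypothetical helper, and imap() is used so that the list position (1, 2) can serve as the column suffix.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: summarise one deduplicated data frame.
# `i` is the position in the list, supplied by imap().
summarise_one <- function(df, i) {
  df %>%
    group_by(Petal.Width) %>%
    summarise("Sepal.Length.mean{i}" := mean(Sepal.Length),
              .groups = "drop")
}

result <- iris %>%
  {
    # `.` is the piped data frame; it is used twice here.
    list(
      distinct(., Species, .keep_all = TRUE),
      distinct(., Species, Petal.Width, .keep_all = TRUE)
    )
  } %>%
  imap(summarise_one) %>%
  reduce(inner_join, by = "Petal.Width")
result
```

Because the list is unnamed, imap() passes the integer index as the second argument, which keeps the column names in line with Sepal.Length.mean1 and Sepal.Length.mean2 from the question.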