I have a dataframe that looks something like this:
  dist    id daytime season
3 1.11 Name1     day summer
4 2.22 Name2   night spring
5 3.33 Name1     day winter
6 4.44 Name3   night   fall
I want a summary of dist
by specific columns in my dataframe.
So far I have used a custom function:
path.summary <- function(x){df %>%
    group_by(x) %>%
    summarize(min = min(dist),
              q1 = quantile(dist, 0.25),
              median = median(dist),
              mean = mean(dist),
              q3 = quantile(dist, 0.75),
              max = max(dist))}
and applied it to whichever column I wanted at the moment:
summary_ID <- path.summary(id)
When I ran it a few weeks ago, I got something like this:
  id      min    q1 median  mean    q3   max
  <chr> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>
1 Name1     0  17.8  310.   788. 1023. 5832.
2 Name2     0  31.7  284.   570.  744. 9578.
3 Name3     0  17.0  325.   721. 1185. 5293.
4 Name4     0  11.9  197.   530.  865. 3476.
5 Name5     0  24.5   94.9  617.  966. 9567.
When I try it now I get an error:
Error in `group_by()`:
! Must group by variables found in `.data`.
✖ Column `x` is not found.
What changed and how do I get around the issue?
CodePudding user response:
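Here, we can use {{ }} (curly-curly) inside group_by()
if the column name is passed unquoted. Passing the data frame as an argument also avoids relying on a global df: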
path_summary <- function(dat, x){
  dat %>%
    group_by({{x}}) %>%
    summarize(min = min(dist),
              q1 = quantile(dist, 0.25),
              median = median(dist),
              mean = mean(dist),
              q3 = quantile(dist, 0.75),
              max = max(dist))
}
Testing:
> path_summary(df, id)
# A tibble: 3 × 7
  id      min    q1 median  mean    q3   max
  <chr> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>
1 Name1  1.11  1.66   2.22  2.22  2.78  3.33
2 Name2  2.22  2.22   2.22  2.22  2.22  2.22
3 Name3  4.44  4.44   4.44  4.44  4.44  4.44
Data:
df <- structure(list(dist = c(1.11, 2.22, 3.33, 4.44), id = c("Name1",
"Name2", "Name1", "Name3"), daytime = c("day", "night", "day",
"night"), season = c("summer", "spring", "winter", "fall")),
class = "data.frame", row.names = c("3",
"4", "5", "6"))
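
The {{ }} approach covers unquoted column names. If the column name arrives as a character string instead (for example from a vector of names), one possible variant is to group with across(all_of(x)). This is only a sketch, assuming dplyr 1.0.0 or later; the name path_summary_chr is hypothetical and not part of the original answer, and the data frame is rebuilt here so the block runs on its own.

```r
library(dplyr)

# Hypothetical string-based variant of path_summary();
# x is a character vector of one or more column names.
path_summary_chr <- function(dat, x) {
  dat %>%
    # all_of() selects exactly the columns named in x and errors if one is missing
    group_by(across(all_of(x))) %>%
    summarize(min = min(dist),
              q1 = quantile(dist, 0.25),
              median = median(dist),
              mean = mean(dist),
              q3 = quantile(dist, 0.75),
              max = max(dist),
              .groups = "drop")
}

# Same example data as in the answer, rebuilt with data.frame()
df <- data.frame(dist = c(1.11, 2.22, 3.33, 4.44),
                 id = c("Name1", "Name2", "Name1", "Name3"),
                 daytime = c("day", "night", "day", "night"),
                 season = c("summer", "spring", "winter", "fall"))

res <- path_summary_chr(df, "id")
print(res)
```

Because x is a character vector, the same function should also accept several grouping columns at once, e.g. path_summary_chr(df, c("id", "daytime")).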