I do have an R data frame like this:
city2001 <- c('a', 'b', 'a')
grade2001 <- c(5, 5, 7)
city2002 <- c('b', 'b', 'a')
grade2002 <- c(8, 9, 10)
df <- data.frame(city2001, grade2001, city2002, grade2002)
and would like to return ,
avg_a = 7.333
# from (5 7 10)/3
How is the logic for that? Thanks.
CodePudding user response:
Try
mean(df[,grepl("grade",colnames(df))][df[,grepl("city",colnames(df))]=="a"])
[1] 7.333333
your df (columns) better be sorted.
If you want for all the groups and not just "a"
tapply(
unlist(df[,grepl("grade",colnames(df))]),
unlist(df[,grepl("city",colnames(df))]),
mean
)
a b
7.333333 7.333333
CodePudding user response:
library(tidyverse)
df %>%
pivot_longer(everything(), names_to = c('.value', 'year'),
names_pattern = '(\\D )(\\d )') %>%
group_by(city)%>%
summarise(mean=mean(grade))
# A tibble: 2 x 2
city mean
<chr> <dbl>
1 a 7.33
2 b 7.33
CodePudding user response:
Here is a one-liner via base R,
aggregate(v2 ~ v1,
cbind.data.frame(v1 = stack(df[c(TRUE, FALSE)])$values,
v2 = stack(df[c(FALSE, TRUE)])$values),
mean)
# v1 v2
#1 a 7.333333
#2 b 7.333333
CodePudding user response:
Another possible solution:
library(tidyverse)
map(list(df[1:2], df[3:4]), ~ filter(.x, .x[1] == "a") %>% pull(2)) %>%
unlist %>% mean
#> [1] 7.333333
Or getting means for each city:
library(tidyverse)
map_dfr(list(df[1:2], df[3:4]), ~ `colnames<-`(.x, c("city", "grade"))) %>%
group_by(city) %>%
summarise(means = mean(grade))
#> # A tibble: 2 x 2
#> city means
#> <chr> <dbl>
#> 1 a 7.33
#> 2 b 7.33