My data are as follows:
year week site prop
2020 3 A 0.2
2020 3 B 0.1
2020 4 A 0.3
2020 4 B 0.5
2021 3 A 0.9
2021 3 B 0.7
2021 4 A 0.1
2021 4 B 0.8
2021 5 A 0.1
2021 5 B 0.8
I would like to calculate the mean of prop across years, within the same week and for the same site. For example, the prop values at site A in week 3 are 0.2 in 2020 and 0.9 in 2021, so the mean is 0.55. If a week is present in one year but not the other, the value under mean should be NA. My desired output is as follows:
year week site prop mean
2020 3 A 0.2 0.55
2020 3 B 0.1 0.4
2020 4 A 0.3 0.2
2020 4 B 0.5 0.65
2021 3 A 0.9 0.55
2021 3 B 0.7 0.4
2021 4 A 0.1 0.2
2021 4 B 0.8 0.65
2021 5 A 0.1 NA
2021 5 B 0.8 NA
Thank you in advance.
CodePudding user response:
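I would use the dplyr group_by() and summarise() approach. Note that summarise() returns one row per week and site, so the year and prop columns are dropped: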
library(dplyr)
data_frame %>% group_by(week, site) %>% summarise(mean = mean(prop))
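Since the desired output keeps every original row, a variant with mutate() in place of summarise() may be closer to what is asked. This is a minimal sketch, assuming the data are in a data frame called df (a name chosen here for illustration):

```r
library(dplyr)

# Example data from the question; `df` is an illustrative name
df <- read.table(header = TRUE, text = "
year week site prop
2020 3 A 0.2
2020 3 B 0.1
2020 4 A 0.3
2020 4 B 0.5
2021 3 A 0.9
2021 3 B 0.7
2021 4 A 0.1
2021 4 B 0.8
2021 5 A 0.1
2021 5 B 0.8
")

# mutate() adds the group mean to every row instead of collapsing each group
with_mean <- df %>%
  group_by(week, site) %>%
  mutate(mean = mean(prop)) %>%
  ungroup()
```

On its own, this gives week 5 the mean of its single 2021 value rather than NA, which is what the NA rows described next address.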
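To get the NA values, I would add rows to your data frame for the missing year/week/site combinations, with prop set to NA. By default, mean() returns NA when any value in the group is NA: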
year week site prop
2020 3 A 0.2
2020 3 B 0.1
2020 4 A 0.3
2020 4 B 0.5
2020 5 A NA
2020 5 B NA
2021 3 A 0.9
2021 3 B 0.7
2021 4 A 0.1
2021 4 B 0.8
2021 5 A 0.1
2021 5 B 0.8
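Adding the NA rows by hand gets tedious for larger data. One possible way to automate it is tidyr::complete(), which fills in the missing combinations of year, week, and site with NA. The sketch below assumes a data frame named df (an illustrative name) and assumes the original data contain no genuine NA values in prop, since the padded rows are dropped again at the end by filtering on NA:

```r
library(dplyr)
library(tidyr)

# Example data from the question; `df` is an illustrative name
df <- read.table(header = TRUE, text = "
year week site prop
2020 3 A 0.2
2020 3 B 0.1
2020 4 A 0.3
2020 4 B 0.5
2021 3 A 0.9
2021 3 B 0.7
2021 4 A 0.1
2021 4 B 0.8
2021 5 A 0.1
2021 5 B 0.8
")

result <- df %>%
  # Add a row with prop = NA for every missing year/week/site combination
  complete(year, week, site) %>%
  group_by(week, site) %>%
  # mean() propagates NA, so groups missing a year get mean = NA
  mutate(mean = mean(prop)) %>%
  ungroup() %>%
  # Drop the padded rows again (assumes no genuine NA in the original prop)
  filter(!is.na(prop))
```

The result keeps the ten original rows, with mean set to NA for week 5 because that week has no 2020 observation.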