I am working with R and I have a series x at quarterly frequency and for which I want to extract the mean over the four quarters in 2012 and store that value in all rows of a newly created column. I have this kind of dataset
date durabl services
2011-10-01 56.7 37.1
2012-01-01 68.1 90.6
2012-04-01 34.1 29.1
2012-07-01 22.56 34.12
2012-10-01 44.89 66.8
And I want to get to this
date durabl services base_durabl base_services
2011-10-01 56.7 37.1 42.4125 55.155
2012-01-01 68.1 90.6 42.4125 55.155
2012-04-01 34.1 29.1 42.4125 55.155
2012-07-01 22.56 34.12 42.4125 55.155
2012-10-01 44.89 66.8 42.4125 55.155
I tried using dplyr
df %>% group_by(year(date)) %>% filter(year(date)==2012) %>% mutate(base_durabl= mean(durabl),base_services=mean(services))
or with summarise but of course I just get a smaller tibble.
CodePudding user response:
You can simply index the rows of services
and durabl
for which you want the mean estimated:
df %>%
mutate(base_durabl = mean(durabl[year(date)==2012]),
base_services = mean(services[year(date)==2012]))
Output:
date durabl services base_durabl base_services
1: 2011-10-01 56.70 37.10 42.4125 55.155
2: 2012-01-01 68.10 90.60 42.4125 55.155
3: 2012-04-01 34.10 29.10 42.4125 55.155
4: 2012-07-01 22.56 34.12 42.4125 55.155
5: 2012-10-01 44.89 66.80 42.4125 55.155
CodePudding user response:
Here is an alternative way to @langtang's comprehensibly answer:
library(dplyr)
df %>%
mutate(across(-date, ~mean(.[grep("2012", date)], na.rm=TRUE), .names = "base_{col}"))
date durabl services base_durabl base_services
1 2011-10-01 56.70 37.10 42.4125 55.155
2 2012-01-01 68.10 90.60 42.4125 55.155
3 2012-04-01 34.10 29.10 42.4125 55.155
4 2012-07-01 22.56 34.12 42.4125 55.155
5 2012-10-01 44.89 66.80 42.4125 55.155