Groupby, filter, summarise and then apply the result to the whole column-CodePudding

I am working with R and I have a series x at quarterly frequency and for which I want to extract the mean over the four quarters in 2012 and store that value in all rows of a newly created column. I have this kind of dataset

date durabl services 
2011-10-01 56.7 37.1
2012-01-01 68.1 90.6
2012-04-01 34.1 29.1
2012-07-01 22.56 34.12
2012-10-01 44.89 66.8

And I want to get to this

date durabl services base_durabl base_services
2011-10-01 56.7 37.1 42.4125 55.155
2012-01-01 68.1 90.6 42.4125 55.155
2012-04-01 34.1 29.1 42.4125 55.155
2012-07-01 22.56 34.12 42.4125 55.155
2012-10-01 44.89 66.8 42.4125 55.155

I tried using dplyr

df %>% group_by(year(date)) %>% filter(year(date)==2012) %>% mutate(base_durabl= mean(durabl),base_services=mean(services))

or with summarise but of course I just get a smaller tibble.

CodePudding user response：

You can simply index the rows of services and durabl for which you want the mean estimated:

df %>% 
  mutate(base_durabl = mean(durabl[year(date)==2012]),
         base_services = mean(services[year(date)==2012]))

Output:

         date durabl services base_durabl base_services
1: 2011-10-01  56.70    37.10     42.4125        55.155
2: 2012-01-01  68.10    90.60     42.4125        55.155
3: 2012-04-01  34.10    29.10     42.4125        55.155
4: 2012-07-01  22.56    34.12     42.4125        55.155
5: 2012-10-01  44.89    66.80     42.4125        55.155

CodePudding user response：

Here is an alternative way to @langtang's comprehensibly answer:

library(dplyr)

df %>% 
  mutate(across(-date, ~mean(.[grep("2012", date)], na.rm=TRUE), .names = "base_{col}"))

        date durabl services base_durabl base_services
1 2011-10-01  56.70    37.10     42.4125        55.155
2 2012-01-01  68.10    90.60     42.4125        55.155
3 2012-04-01  34.10    29.10     42.4125        55.155
4 2012-07-01  22.56    34.12     42.4125        55.155
5 2012-10-01  44.89    66.80     42.4125        55.155