I have a data set with the following columns: Ticker, Company_name, Forecast_period, Measure, Estimate, Forecast_date.
In the Measure column the value is either EPS or DPS and the Forecast_period is either 1 or 2.
I would like to calculate the mean of EPS1, (mean of Estimate if Measure = EPS & Forecast_period =1) mean of EPS2 (mean of Estimate if Measure = EPS & Forecast_period = 2) and the mean of DPS1 (mean of Estimate if Measure = DPS & Forecast_period =1).
For that I wrote the following code:
IBESdatamean2 <- IBESdata %>%
group_by(Company_name, Ticker, Forecast_period) %>%
summarize(mean[Measure == "EPS" & Forecast_period == 1, 'Estimate'])
I need to group my data by Company_name, Ticker and Forecast_period that I am sure of, however I am not sure if I have written the right code for the mean calculation because I am getting the following error:
Caused by error in `mean[Measure == "EPS" & Forecast_period == 1, "Estimate"]`:
! object of type 'closure' is not subsettable
I want to have a conditional mean calculation as described above and if anyone could help me that would be greatly appreciated.
CodePudding user response:
The summarise function works to implement a function on the entire dataset. In this case we want to filter to a specific subset and then extract a single value.
We can use the filter()
function and then implement a pull()
and mean()
library(dplyr)
IBESdatamean2 <- IBESdata %>%
group_by(Company_name, Ticker, Forecast_period) %>%
filter(Measure == "EPS", Forecast_period == 1) %>%
pull(Estimate) %>%
mean(na.rm = T)
This is how you have written your code.
However, what I think you are actually asking for is the mean of each individual measure and forecast period per company.
In which case it would be appropriate to implement a summarise()
, however your group_by()
is incorrect.
library(dplyr)
IBES_ALL_MEANS <- IBESdata %>%
group_by(Company_name, Forecast_period, Measure) %>%
summarise(Mean_Estimates = mean(Estimate, na.rm = T)
If you wanted to further filter down to a specific company, forecast period or measure you can now do so using the filter()
function on the IBES_ALL_MEANS
dataset.