I need to obtain the last 30 years mean value of TOTAL column month over month.
The dataset is avaible here:
library(dplyr)
ENSO <-read.table("http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/detrend.nino34.ascii.txt", header = TRUE)
glimpse(ENSO)
For example for the sep-2021 I need to calculate:
$$
(TOTAL_{sep-2021}
TOTAL_{sep-2020}
TOTAL_{sep-2019}
...
TOTAL_{sep-1991}) / 30
$$
I tried to use dplyr::mutate
but I think that slider
or zoo
maybe can be helpful inside an condition because the time series begins in jan-1950 and obviously I wouldn't have the last 30 average values MoMs.
CodePudding user response:
You can either fix the current year to 2021 or get the higher year in the table.
Then, you only need to filter out years that are lower than this current year minus 30.
If you want to sugar-coat this, you can even use a custom name for your column.
Here is the code:
current_y = max(ENSO$YR)
col_name = paste0("total_mean_", current_y-30, "_to_", current_y)
ENSO %>%
filter(YR>current_y-30) %>%
group_by(MON) %>%
summarise(!!col_name:=mean(TOTAL))
# # A tibble: 12 x 2
# MON total_mean_1991_to_2021
# <int> <dbl>
# 1 1 26.5
# 2 2 26.7
# 3 3 27.3
# 4 4 27.8
# 5 5 27.9
# 6 6 27.7
# 7 7 27.3
# 8 8 26.8
# 9 9 26.7
#10 10 26.7
#11 11 26.7
#12 12 26.5
CodePudding user response:
Here is a data.table
approach.
functional:
1 - Split the table to a list of tables by month
2 - caculate the rolling mean of the last 30 months
3 - rowbind the monthly tables to a single table
library(data.table)
# Make ENSO a data.table, key by year and month
setDT(ENSO, key = c("YR", "MON"))
# Split by MON
L <- split(ENSO, by = "MON")
# Loop over L, create monthly mean over the last 30 entries
L <- lapply(L, function(x) {
x[, MON30_avg := frollmean(TOTAL, n = 30)]
})
# Rowbind List together again
final <- rbindlist(L, use.names = TRUE, fill = TRUE)
CodePudding user response:
Thank you Bloxx!
I can use the new variable by using
library(dplyr)
library(fpp3)
ENSO <- read.table("http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/detrend.nino34.ascii.txt", header = TRUE) %>%
mutate(
Dates = paste(YR, "-", MON),
Dates = yearmonth(Dates),
Month_Year = paste(month.name[month(Dates)],"/", year(Dates)),
diff_total = difference(TOTAL),
ANOM = round( TOTAL - ClimAdjust, digits = 2),
# TMA = TMA_{t-1} TMA_{t} TMA_{t 1}
TMA = round( slide_dbl(ANOM, mean, .before = 1, .after = 1), digits=2 ),
# ´Climatic Condition`= if 5 last consecutives TMA > 0.5 then El Niño, otherwise if 5 last consecutives TMA < -0.5 then La Niña
`Climatic Condition` =
lag( case_when(
rollapplyr(TMA < -0.5, 5, all, fill = FALSE) ~ "La Niña",
rollapplyr(TMA > 0.5, 5, all, fill = FALSE) ~ "El Niño") ),
`3 months` =
case_when(
month(Dates) == 1 ~ "DJF",
month(Dates) == 2 ~ "JFM",
month(Dates) == 3 ~ "FMA",
month(Dates) == 4 ~ "MAM",
month(Dates) == 5 ~ "AMJ",
month(Dates) == 6 ~ "MJJ",
month(Dates) == 7 ~ "JJA",
month(Dates) == 8 ~ "JAS",
month(Dates) == 9 ~ "ASO",
month(Dates) == 10 ~ "SON",
month(Dates) == 11 ~ "OND",
month(Dates) == 12 ~ "NDJ" )
) %>% as_tsibble(index = Dates)
ENSO <- ENSO %>% # To reorder the dtaaframe
select(
Dates,
Month_Year,
YR,
MON,
TOTAL,
ClimAdjust,
ANOM,
TMA,
`3 months`,
`Climatic Condition`,
diff_total
)
ClimAdj <- ENSO %>%
group_by(MON) %>%
summarise(ClimAdj = mean(TOTAL) )
ENSO <- left_join(ENSO, ClimAdj %>%
select(Dates, ClimAdj), by = c("Dates" = "Dates"))
ENSO <- ENSO %>%
select(
-MON.y
) %>%
rename(
MON = "MON.x"
)
ENSO <- ENSO %>%
select(
Dates,
Month_Year,
YR,
MON,
TOTAL,
#ClimAdjust,
ClimAdj,
ANOM,
TMA,
`3 months`,
`Climatic Condition`,
diff_total
)
glimpse(ENSO)
CodePudding user response:
Here is the updated code. You first arrange by year and month and then slice last 360 months (30 years)! Then group by month and then calculate mean:
ENSO %>% arrange(YR, MON) %>% slice_tail(n = 360) %>% group_by(MON) %>% summarise(mean(TOTAL))
Hope this is what you want. Each month has mean for the last 30 years.