What is the best way to calculate the EWMA of the returns for every column of my time series? Across all columns, we have the returns from Today - 260d (-1 year) until Today - 1d.

The returns are calculated by dividing the close prices of consecutive days.

I was using the following function:

ewma.func <- function(rets, lambda) {
    sig.p <- 0
    sig.s <- vapply(rets, function(r) sig.p <<- sig.p*lambda + (r^2)*(1 - lambda), 0)
    return(sqrt(sig.s))
}

but it can only generate the EWMA for one column at a time, so I also have to do the following:

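library(magrittr)  # provides %>% (also available via dplyr)

# One list slot per column of df
ewma_col <- vector("list", ncol(df))

for (w in seq_len(ncol(df))) {
    ewma_col[[w]] <- ewma.func(df[, w], lambda = 0.94)
}

# Bind the per-column results as rows, then transpose to one column per series
df2 <- do.call(rbind, ewma_col) %>% t()
colnames(df2) <- colnames(df)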

This particular object has 5 columns, and the list I'm working with holds more than 100 other objects like it, so calculating the EWMA column by column for every object is tedious and inefficient. Is there a simpler way to do this?

My sampled df:

structure(list(`25079578000106` = c(0.311405806132825, 0.0261260831393884, 
0.126801611077099, -0.201990496952931, -0.169037712385034, -0.372023939507926, 
0.426906935535953, -0.402262040825008, -0.273008284875687, 0.142923301064002, 
0.0522466965776403, 0.491128923749784, 0.547432459279662, -0.00905547394722817, 
0.243408062669914, -0.565142654522788, -0.0284871479379945, 0.141976900522423, 
-0.115634388475883, 0.0858369759953348, 0.252102295598888, -0.130994651044603, 
0.213179273123387, 0, 0.254748840234242, -0.162688137697842, 
0.0670642675686395, 0.409574624973175, 0.11580733826122, 0.152815408000606, 
-0.194192341950838, 0.079688931509736, 0.0390181277907686, 0.0366672406016733, 
-0.0841513321574894, 0.170703395997407, -0.1032803445014, 0.301935098286776, 
0.12983982123842, 0.179888841921638, -0.04270641511539, 0.194911670405418, 
-0.126730582360324, 0.348033349109755, 0.0962079717282904, 0.0734806822947576, 
0.151055897003971, -0.0701511527950061, -0.161361593563925, 0.246798639636836
), `21144577000147` = c(0.402627056610072, 0.0670045021252008, 
0.136672287590045, -0.257998532470083, -0.126993350295379, -0.57979580369647, 
0.493768537307915, -0.491292521383002, -0.403311319223576, 0.130267872918921, 
-0.0309827290038811, 0.617972996951721, 0.486863606965926, 0.0791557540651411, 
0.221599948054063, -0.743017289278214, -0.122417766579019, 0.199045961198863, 
-0.204796549951425, 0.0958513541263528, 0.221446985779039, -0.103656955252518, 
0.242373424043762, 0, 0.320491723141458, -0.187373789685807, 
0.073898113987525, 0.443193321189028, 0.131922088075953, 0.167945069370035, 
-0.218753095850843, 0.0856883381857187, 0.0915706430532737, 0.0253365722528542, 
-0.10242040234516, 0.210685512865894, -0.111825193016557, 0.343926238201675, 
0.145042635631398, 0.169826889032265, -0.085688805575046, 0.256890913078678, 
-0.173078901207191, 0.502885210153181, 0.0139494439281407, 0.111911786007113, 
0.124141056221561, -0.1009381527183, -0.164678661440121, 0.270671359612606
), `19107923000175` = c(1.17081038442848, -1.53767897591024, 
-0.511278352678346, 0.801980435971927, -1.1354756311448, 1.33550018854294, 
-0.877121115991031, 0.893385693962045, -3.05784205729651, 0.790948188478069, 
-0.874211667633062, -2.0517918994301, 0.108547761010414, 1.31951493240194, 
0.59011726098106, 0.751824284998293, -2.542040795106, -1.30722252988562, 
0.166101507966232, -0.333577277251607, -0.48391700402135, -0.287302340893802, 
0.276978237343428, 0, -2.20114477424431, 2.28636453339277, 3.21842714220111, 
0.591201915267447, 1.88892838687025, -2.4835963874466, 0.93808037963754, 
-2.02373054462441, 1.10818007306079, 0.963590860919794, 0.221162120942608, 
0.927865234370984, 1.30669520840456, -1.5475129142942, 1.44346553624928, 
-1.33299447861646, 2.56613694509724, -0.854390492077073, 0.431278918404132, 
-0.419447091917391, 0.437028634769376, -0.279096110807586, 0.702864309823781, 
1.8092529326168, -1.76575759915067, 1.79323091451806), `25079578000106` = c(-1.46258859240334, 
-1.08758898677479, -0.0989607635347056, 0.877778709582344, -1.81190225830505, 
3.65239411476068, -0.591252178764989, 1.21883593492385, 2.9361510378294, 
-1.21526156190157, 5.60858230674057, -0.483417673513031, 3.11737542488117, 
-0.928573480450723, -0.855911339203885, 1.42741011768521, 2.48564664470905, 
3.64030099535739, -0.0133031404402573, 1.84565666459093, 3.33521612974437, 
-0.706821796120494, -1.41998375802359, 0, -1.00702592444577, 
-0.764259576953918, 0.504494091364904, 2.34908743768756, 1.12513038984616, 
0.883990707916382, -0.23625019375686, 0.794114018390246, -1.84599011799946, 
1.00693676176888, -2.68018999058768, 2.1352680909331, -0.361733150930377, 
1.57261038511933, 0.0516994778081425, 1.29365618286101, 1.84691599060898, 
-0.271832695671037, 1.894436301518, 0.0966644805885153, 1.10278020638361, 
-1.48991306559765, 0.533713807271852, 0.703722278376517, -0.931114916329534, 
2.53580948592571), `19436835000117` = c(1.49069022500044, -1.29966042904925, 
-0.395616604691895, 0.727380076932604, -0.30439719239439, 0.550924036724609, 
-0.846017086223583, 0.841084288731508, -2.71310085681762, 0.432345969238668, 
-1.42297721340583, -1.75329706107732, -0.234704765443894, 1.02912636612018, 
0.953879318876716, 0.506016590225045, -2.46852979989853, -1.29307204251745, 
-0.361195165078243, 0.142310472620011, -0.545533438798884, 0.0622563582510338, 
0.664697968204564, 0, -2.65178033678239, 1.65225289310911, 2.5845850508631, 
0.743106457593967, 1.91502897378086, -2.12601029097641, 0.531378326195409, 
-1.64881667524241, 0.658820966236817, 0.782823536428623, -0.430202234929311, 
0.941061544290278, 1.38020377507928, -1.04732682539179, 0.918463659036206, 
-0.891537194911507, 2.72019066906068, -0.480601724302687, 0.65309472320223, 
0.334795709022728, 0.0443713630374987, -0.747195361418562, 0.921720304359042, 
1.04346702937619, -1.57727738560425, 1.28708233642101)), row.names = c("Retorno D - 260", 
"Retorno D - 259", "Retorno D - 258", "Retorno D - 257", "Retorno D - 256", 
"Retorno D - 255", "Retorno D - 254", "Retorno D - 253", "Retorno D - 252", 
"Retorno D - 251", "Retorno D - 250", "Retorno D - 249", "Retorno D - 248", 
"Retorno D - 247", "Retorno D - 246", "Retorno D - 245", "Retorno D - 244", 
"Retorno D - 243", "Retorno D - 242", "Retorno D - 241", "Retorno D - 240", 
"Retorno D - 239", "Retorno D - 238", "Retorno D - 237", "Retorno D - 236", 
"Retorno D - 235", "Retorno D - 234", "Retorno D - 233", "Retorno D - 232", 
"Retorno D - 231", "Retorno D - 230", "Retorno D - 229", "Retorno D - 228", 
"Retorno D - 227", "Retorno D - 226", "Retorno D - 225", "Retorno D - 224", 
"Retorno D - 223", "Retorno D - 222", "Retorno D - 221", "Retorno D - 220", 
"Retorno D - 219", "Retorno D - 218", "Retorno D - 217", "Retorno D - 216", 
"Retorno D - 215", "Retorno D - 214", "Retorno D - 213", "Retorno D - 212", 
"Retorno D - 211"), class = "data.frame")

CodePudding user response：

You can use map_df from the purrr package to do it in one line.

library(dplyr)
library(purrr)

ewma_col <- map_df(df1, ewma.func, lambda = 0.94)
ewma_col
# A tibble: 50 x 5
   `25079578000106` `21144577000147` `19107923000175` `32666326000149` `19436835000117`
              <dbl>            <dbl>            <dbl>            <dbl>            <dbl>
 1           0.0763           0.0986            0.287            0.358            0.365
 2           0.0742           0.0970            0.468            0.438            0.476
 3           0.0784           0.0998            0.471            0.425            0.472
 4           0.0907           0.116             0.497            0.465            0.491
 5           0.0972           0.116             0.556            0.633            0.482
 6           0.131            0.181             0.631            1.08             0.486
 7           0.165            0.213             0.648            1.06             0.515
 8           0.188            0.239             0.666            1.07             0.540
 9           0.194            0.252             0.989            1.26             0.846
10           0.191            0.247             0.978            1.26             0.827
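
The question also mentions a list of roughly 100 such objects. Here is a minimal base-R sketch of one way to handle that, assuming the objects are data frames collected in a list. The names `ewma_matrix` and `df_list` are hypothetical, and the toy values are made up for illustration.

```r
# The question's function (EWMA of squared returns, then square root).
ewma.func <- function(rets, lambda) {
  sig.p <- 0
  sig.s <- vapply(rets, function(r) sig.p <<- sig.p * lambda + (r^2) * (1 - lambda), 0)
  sqrt(sig.s)
}

# Hypothetical helper: EWMA for every column of one data frame.
# vapply checks that each column yields exactly nrow(d) numbers and
# returns a matrix with one column per series.
ewma_matrix <- function(d, lambda = 0.94) {
  vapply(d, ewma.func, numeric(nrow(d)), lambda = lambda)
}

# Hypothetical stand-in for the list of ~100 objects (toy values).
df_list <- list(
  a = data.frame(x = c(1, 1), y = c(2, 0)),
  b = data.frame(z = c(0, 3))
)

# One call covers every column of every object.
ewma_list <- lapply(df_list, ewma_matrix, lambda = 0.5)
```

Each element of `ewma_list` is a matrix, like `df2` in the question. The same outer `lapply` should also work with `map_df` as the inner step if tibbles are preferred.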

CodePudding user response：

Functional Programming Solution to Calc EWMA

It is desirable to avoid the superassignment operator `<<-`, which assigns outside the local scope. That little assignment is easy to miss when refactoring or copy/pasting code.

1. Source example data and load required libraries

Copy pasting the dput output above into a text file allows us to source the file and store the value, a data.frame, in a variable (df).

DPUT_TEXT_FILE <- '/tmp/example_dput.txt'
df <- source(DPUT_TEXT_FILE)$value

We'll use purrr to create a function on the fly, and dplyr to apply a function to every column of the data frame.

library(dplyr)
library(purrr)

Inner calculation

The innermost calculation in the question's body boils down to the following. An important aspect of the original implementation that this eschews (for now) is that the result of this function gets fed back into subsequent calls.

# Calculate single statistic
single_ewa <- function(sig_p, val, lambda){ 
  sig_p*lambda + (val^2)*(1 - lambda)
}
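
Written as a recursion, the single step above can be read as follows. The symbols are my own notation, not taken from the question: \sigma_{t-1}^2 stands for `sig_p`, r_t for `val`, and \lambda for `lambda`. The starting value of zero matches the initial condition used in the question's code.

```latex
\sigma_t^2 = \lambda\,\sigma_{t-1}^2 + (1 - \lambda)\,r_t^2, \qquad \sigma_0^2 = 0
```

The reported EWMA value for step t is then \sigma_t = \sqrt{\sigma_t^2}, which is the square root taken at the end of the calculation.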

Solve it with a partially applied function and an accumulator function.

# Calculate full weighted moving average
calc_ewma <- function(vals, lambda){
  # Partially apply calc single stat function to set lambda
  part_ewa <- partial(single_ewa, lambda=lambda)

  # Reduce and accumulate to get the raw moving results
  raw_result <- Reduce(f=part_ewa, x=vals, init=0, accumulate = TRUE )

  # Square root finishes the calculation
  result <- sqrt(raw_result)

  # Finally, drop the initial condition from the accumulation
  result <- result[-1]
  return(result)
}

Partially applied function

Using purrr::partial we can fix lambda and get back a function that no longer needs it passed as a parameter. This is called partial application (it is related to, but not the same as, currying). The result is a function with the arity (number of parameters) we're looking for: it takes only the accumulated value sig_p and a new value val.
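
To make the idea concrete, here is a sketch of a hand-written base-R closure that behaves like the partially applied function for this case. `fix_lambda` is a hypothetical helper written for illustration; it is not part of purrr.

```r
# Single EWMA step, as defined above.
single_ewa <- function(sig_p, val, lambda) {
  sig_p * lambda + (val^2) * (1 - lambda)
}

# Hypothetical helper: return a two-argument function with lambda fixed.
fix_lambda <- function(f, lambda) {
  function(sig_p, val) f(sig_p, val, lambda = lambda)
}

part_ewa <- fix_lambda(single_ewa, lambda = 0.94)

# Both calls compute 0 * 0.94 + 1^2 * (1 - 0.94).
direct  <- single_ewa(0, 1, lambda = 0.94)
partial <- part_ewa(0, 1)
```

The closure captures `lambda` once, so the accumulator only ever sees a function of two arguments.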

The accumulator function, `Reduce`

We're using Reduce to start with an initial value of 0, pass it a value from the input vector, run the function that processes those two, and finally accumulate the result for the next iteration. This continues once for each member of the input vector. Using accumulate=TRUE yields a vector of the accumulated values instead of just the final value.
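
A small illustration of these `Reduce` options on a running sum may help. The input `1:4` and the variable names are arbitrary choices for the example.

```r
add <- function(acc, x) acc + x

# Without accumulate, only the final value is returned.
final_only <- Reduce(add, 1:4, init = 0)

# With accumulate = TRUE, every intermediate value is kept.
running <- Reduce(add, 1:4, accumulate = TRUE)

# When init is supplied, it becomes the first element of the result,
# which is why calc_ewma drops the first element afterwards.
with_init <- Reduce(add, 1:4, init = 0, accumulate = TRUE)
```

Here `running` has four elements and `with_init` has five, the extra one being the initial value.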

Finally, hack around column names and do the work

Temporarily swapping out column names for enumerated names prevents the dplyr functions from throwing errors about non-unique column names.

# Save old colnames as hack around duplicate column names
old_colnames <- colnames(df)
colnames(df) <- as.character(seq_len(ncol(df)))

# Do calculation
ewma_df <- df %>% mutate_all(.funs=calc_ewma, lambda=0.94)

# re-assign colnames
colnames(df) <- old_colnames
colnames(ewma_df) <- old_colnames
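
As an alternative sketch, base R can fill the columns positionally, which sidesteps the renaming step because nothing looks columns up by name. This version of `calc_ewma` uses a plain closure instead of purrr, and the `toy` data frame with its values is made up for illustration.

```r
# Same calculation as above, written without purrr.
calc_ewma <- function(vals, lambda) {
  step <- function(sig_p, val) sig_p * lambda + (val^2) * (1 - lambda)
  raw <- Reduce(step, vals, init = 0, accumulate = TRUE)
  sqrt(raw[-1])  # drop the initial condition
}

# Toy data frame with a deliberately duplicated column name.
toy <- data.frame(c(1, 1), c(2, 0), c(0, 3))
names(toy) <- c("a", "b", "a")

# Copy the frame, then overwrite every column in place.
# Names and row names are kept from the copy.
ewma_toy <- toy
ewma_toy[] <- lapply(toy, calc_ewma, lambda = 0.5)
```

The `ewma_toy[] <- lapply(...)` idiom keeps the data frame structure while replacing its contents, so the duplicated name "a" is preserved as-is.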