If I have a data frame that I am going through using dplyr, how can I get the cummean of the last 2 rows before the last row?
For example...
------
|value|
-------
| 1 |
------
| 2 |
-------
| 3 |
------
| 4 |
-------
In this case, when the code goes through the value 4, I want the cummean to calculate (2 3)/2.
I am thinking something along the lines of this--
new_df -> df %>%
summarise(
roll_mean = cummean(value,-2)
)
However I know this does not work
CodePudding user response:
Assuming that you mean is at each position calculate the mean of the prior two elements of value
, use rollapply
with a width of list(-seq(2))
which means that at each point use the offsets -1 and -2, i.e. one and two positions earlier.
library(dplyr, exclude = c("filter", "lag"))
library(zoo)
DF %>% mutate(mean = rollapply(value, list(-seq(2)), mean, fill = NA))
## value mean
## 1 1 NA
## 2 2 NA
## 3 3 1.5
## 4 4 2.5
A variation is:
DF %>% mutate(mean = (rollsumr(value, 3, fill = NA) - value) / 2)
Note
Input in reproducible form:
DF <- data.frame(value = 1:4)
CodePudding user response:
[Using this data: df <- data.frame(value = 1:4)
]
For two prior values, you could do it manually like so:
df %>%
mutate(roll_mean = (lag(value) lag(value,2))/2)
# value roll_mean
#1 1 NA
#2 2 NA
#3 3 1.5
#4 4 2.5
Or if the window should be bigger, you could use the difference of two cumsum
values:
df %>%
mutate(cuml = cumsum(value),
roll_mean = (lag(cuml) - lag(cuml,3,default=0))/2)
# value cuml roll_mean
#1 1 1 NA
#2 2 3 0.5
#3 3 6 1.5
#4 4 10 2.5
Or I like slider:slide_*
for window functions. Here we ask for the mean for a window that starts two elements prior (.before = 2
) and ends one element prior (after = -1
).
df %>%
mutate(roll_mean = slider::slide_dbl(value, mean, .before = 2, .after = -1))
# value roll_mean
#1 1 NaN
#2 2 1.0
#3 3 1.5
#4 4 2.5
By default slider
allows incomplete windows; if we want the same output we could use:
df %>%
mutate(roll_mean = slider::slide_dbl(value, mean, .before = 2, .after = -1, .complete = TRUE))
# value roll_mean
#1 1 NA
#2 2 NA
#3 3 1.5
#4 4 2.5