I am working in R and want to add a new column WITHOUT using a for loop.
Here is what I want to do:
For each row, I want to calculate the mean of all values from the first one up to the current one.
With a for loop, I would do it this way:
for (i in seq_len(nrow(a))) {
  a$Xn_bar[i] <- mean(a$Xn[1:i])
}
Is there another way (e.g. with map)?
Here is the data:
a <- data.frame(
  n = 1:10,
  Xn = c(-0.502, 0.132, -0.079, 0.887, 0.117, 0.319, -0.582, 0.715, -0.825, -0.360)
)
CodePudding user response:
You can do this with dplyr::cummean(), or calculate it in base R by dividing the cumulative sum by the number of values so far:
cumsum(a$Xn) / seq_along(a$Xn) # base R
dplyr::cummean(a$Xn) # dplyr
# Output in both cases
# [1] -0.50200000 -0.18500000 -0.14966667 0.10950000 0.11100000 0.14566667 0.04171429
# [8] 0.12587500 0.02022222 -0.01780000
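Since the question asks for a new column, here is a minimal sketch of assigning the base R result back to the data frame. The column name Xn_bar is taken from the loop in the question; everything else is just the data from the question.

```r
a <- data.frame(
  n = 1:10,
  Xn = c(-0.502, 0.132, -0.079, 0.887, 0.117, 0.319, -0.582, 0.715, -0.825, -0.360)
)

# Running mean: cumulative sum divided by the count of values so far
a$Xn_bar <- cumsum(a$Xn) / seq_along(a$Xn)
```

The last entry of the new column should equal mean(a$Xn), which is a quick sanity check.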
CodePudding user response:
Here is one solution using dplyr's row_number() and the mapply() function:
library(dplyr)
df <- data.frame(n = c(1, 2, 3, 4, 5),
                 Xn = c(-0.502, 0.132, -0.079, 0.887, 0.117))
# Add a row_index column holding each row's number
df <- df %>%
  mutate(row_index = row_number())
# Add column Xxn: mean of Xn from the first row through the current row,
# rounded to 3 decimals (x is passed in by mapply but not used)
df$Xxn <- mapply(function(x, y) round(mean(df$Xn[1:y]), 3),
                 df$Xn,
                 df$row_index,
                 USE.NAMES = FALSE)
# Now remove the row_index column
df <- df %>% select(-row_index)
df
# > df
# n Xn Xxn
# 1 1 -0.502 -0.502
# 2 2 0.132 -0.185
# 3 3 -0.079 -0.150
# 4 4 0.887 0.110
# 5 5 0.117 0.111
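If you want the map-style approach without the temporary row_index column, a possible base R sketch is to iterate over the positions directly with vapply(). The column name Xn_bar is only an illustrative choice, and no rounding is applied here. Note that this recomputes the mean for every row, so it does more work than the cumulative-sum approach on long vectors.

```r
df <- data.frame(n = 1:5,
                 Xn = c(-0.502, 0.132, -0.079, 0.887, 0.117))

# For each position i, take the mean of the first i values of Xn
df$Xn_bar <- vapply(seq_along(df$Xn),
                    function(i) mean(df$Xn[seq_len(i)]),
                    numeric(1))
```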