(base R) How to apply `diff()` to a column and append it as a new data.frame column?

Time:08-03

I have a data frame df1 like this:

time Diamond.Hands returns volume close
2021-02-16 10:00:00 0.4583333 0.0056710775 10059 53.20
2021-02-16 11:00:00 0.2352941 -0.0037586920 8664 53.01
2021-02-16 12:00:00 0.4400000 -0.0037586920 10059 52.40

I tried to compute log returns and add them as a new column:

# Log return
prices <- df1$close
log_returns <- diff(log(prices), lag=1)
df1$logreturns <- log_returns 

This returns the following error:

Error in `$<-.data.frame`(`*tmp*`, logreturns, value = c(0.000187952260679136,  :
  replacement has 2219 rows, data has 2220

Do you have any ideas how to fix that?

CodePudding user response:

When you do

y <- diff(x, lag = m, differences = k)

the resulting vector y has m * k fewer elements than x. If you want both x and y as columns of the same data.frame or matrix, you need to pad y with m * k leading NAs so the lengths match.
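
As a rough illustration of that rule, here is a minimal sketch of a padding helper. The name padded_diff is hypothetical (it is not part of base R), and the sketch assumes x is a plain numeric vector:

```r
# Hypothetical helper: diff() with leading NAs added, so the result
# has the same length as the input vector.
padded_diff <- function(x, lag = 1L, differences = 1L) {
  # diff() drops lag * differences elements (or everything, if x is shorter).
  pad <- min(lag * differences, length(x))
  c(rep(NA, pad), diff(x, lag = lag, differences = differences))
}

x <- c(1, 4, 9, 16, 25, 36)

# Unpadded: 6 - 2 * 2 = 2 elements remain.
short <- diff(x, lag = 2, differences = 2)

# Padded: same length as x, with 4 leading NAs.
y <- padded_diff(x, lag = 2, differences = 2)

length(short)
length(y) == length(x)
```

With lag = 2 and differences = 2, four elements are lost, so four NAs go in front.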

In your case, m = 1 and k = 1, so you need to pad one NA:

df1$logreturns <- c(NA, log_returns)

More concisely, we can pack your 3 lines of code into 1:

df1$logreturns <- c(NA, diff(log(df1$close)))
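
To check this on a small case, the sketch below rebuilds only the time and close columns from the three rows shown in the question (the other columns are omitted, and time is kept as a character column for brevity):

```r
# The three rows shown in the question, reduced to two columns.
df1 <- data.frame(
  time  = c("2021-02-16 10:00:00", "2021-02-16 11:00:00", "2021-02-16 12:00:00"),
  close = c(53.20, 53.01, 52.40)
)

# diff() yields nrow(df1) - 1 values, so one leading NA restores the length.
df1$logreturns <- c(NA, diff(log(df1$close)))

print(df1)
```

The first row has no previous price, so its log return is NA; every later row holds log(close[i] / close[i - 1]).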

Remark:

If you want to do the same with mutate() in dplyr, something like this should work:

df1 %>% mutate(logreturns = c(NA, diff(log(close))))
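
For comparison, here is a small self-contained sketch of the dplyr route. It assumes dplyr is installed and uses a three-row stand-in for df1. The second variant, based on dplyr's lag(), is an alternative that is not part of the original answer; it may be convenient because lag() already returns a vector of the same length with NA in the first position:

```r
library(dplyr)  # assumes dplyr is installed

# Stand-in for df1 with only the close column from the question.
df1 <- data.frame(close = c(53.20, 53.01, 52.40))

# Variant from the answer: pad diff() by hand.
a <- df1 %>% mutate(logreturns = c(NA, diff(log(close))))

# Alternative: subtract the lagged log price; no manual padding needed.
b <- df1 %>% mutate(logreturns = log(close) - lag(log(close)))

all.equal(a$logreturns, b$logreturns)
```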

Here is another possibly related Q & A: Error when using "diff" function inside of dplyr mutate.
