Trying to add an extra column in the data set to calculate daily percentage returns of each asset

Time:10-19

My data set looks like this:

symbol  date    adjusted
BAC 2000-01-03  13.61120
BAC 2000-01-04  12.80331
BAC 2000-01-05  12.94381
BAC 2000-01-06  14.05027
BAC 2000-01-07  13.68145
BAC 2000-01-10  13.20725

The symbol column contains three different stocks. I want to add a column with the daily return of each asset, but I am stuck on how to do it.

CodePudding user response:

Assuming that the daily return is the difference between the current day's adjusted price and the previous day's (and that the rows are already sorted by date within each symbol), this should give you what you want:

library(dplyr)

df %>% 
  group_by(symbol) %>% 
  mutate(return = adjusted - lag(adjusted, 1))

# A tibble: 6 × 4
# Groups:   symbol [1]
  symbol date       adjusted return
  <chr>  <date>        <dbl>  <dbl>
1 BAC    2000-01-03     13.6 NA    
2 BAC    2000-01-04     12.8 -0.808
3 BAC    2000-01-05     12.9  0.140
4 BAC    2000-01-06     14.1  1.11 
5 BAC    2000-01-07     13.7 -0.369
6 BAC    2000-01-10     13.2 -0.474
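
Since the question title asks for percentage returns, here is a minimal sketch of the same grouped approach that divides by the previous day's price instead of subtracting it. It assumes a percentage return means (today / yesterday - 1) * 100; the column name pct_return and the arrange() step are additions for illustration, not part of the answer above.

```r
library(dplyr)

# The six BAC rows shown in the question
df <- data.frame(
  symbol = "BAC",
  date = as.Date(c("2000-01-03", "2000-01-04", "2000-01-05",
                   "2000-01-06", "2000-01-07", "2000-01-10")),
  adjusted = c(13.61120, 12.80331, 12.94381, 14.05027, 13.68145, 13.20725)
)

df_pct <- df %>%
  arrange(symbol, date) %>%   # lag() relies on row order, so sort first
  group_by(symbol) %>%        # keep each symbol's series separate
  mutate(pct_return = (adjusted / lag(adjusted) - 1) * 100) %>%
  ungroup()

df_pct
```

The first row of each symbol is NA because there is no previous day to compare against.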

CodePudding user response:

The example below simulates data with multiple symbol values. As Just James' comment suggests, apply the diff function to each group separately. The code uses ave rather than unlist(tapply(...)): tapply returns the groups in sorted order (ACB, BAC, CAB), which would misalign the results whenever the rows are stored in a different order, whereas ave returns a vector in the original row order. You need to add one NA at the start of each group since you can't calculate a return for the very first value.

df <- structure(list(symbol = c("BAC", "BAC", "BAC", "BAC", "BAC", 
"BAC", "CAB", "CAB", "CAB", "CAB", "CAB", "CAB", "ACB", "ACB", 
"ACB", "ACB", "ACB", "ACB"), date = c("2000-01-03", "2000-01-04", 
"2000-01-05", "2000-01-06", "2000-01-07", "2000-01-10", "2000-01-03", 
"2000-01-04", "2000-01-05", "2000-01-06", "2000-01-07", "2000-01-10", 
"2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06", "2000-01-07", 
"2000-01-10"), adjusted = c(13.6112, 12.80331, 12.94381, 14.05027, 
13.68145, 13.20725, 13.6112, 12.80331, 12.94381, 14.05027, 13.68145, 
13.20725, 13.6112, 12.80331, 12.94381, 14.05027, 13.68145, 13.20725
)), class = "data.frame", row.names = c(NA, -18L))

# Per-symbol day-over-day difference, padded with a leading NA; ave() keeps the row order of df
df$returns <- ave(df$adjusted, df$symbol, FUN = function(adj) c(NA, diff(adj)))
df
   symbol       date adjusted  returns
1     BAC 2000-01-03 13.61120       NA
2     BAC 2000-01-04 12.80331 -0.80789
3     BAC 2000-01-05 12.94381  0.14050
4     BAC 2000-01-06 14.05027  1.10646
5     BAC 2000-01-07 13.68145 -0.36882
6     BAC 2000-01-10 13.20725 -0.47420
7     CAB 2000-01-03 13.61120       NA
8     CAB 2000-01-04 12.80331 -0.80789
9     CAB 2000-01-05 12.94381  0.14050
10    CAB 2000-01-06 14.05027  1.10646
11    CAB 2000-01-07 13.68145 -0.36882
12    CAB 2000-01-10 13.20725 -0.47420
13    ACB 2000-01-03 13.61120       NA
14    ACB 2000-01-04 12.80331 -0.80789
15    ACB 2000-01-05 12.94381  0.14050
16    ACB 2000-01-06 14.05027  1.10646
17    ACB 2000-01-07 13.68145 -0.36882
18    ACB 2000-01-10 13.20725 -0.47420
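
For percentage returns in base R, a sketch along the same lines divides each difference by the previous day's price. The BAC prices come from the question; the XYZ symbol, its prices, and the names prices, pct_change and pct_return are hypothetical and only there to show that the result stays aligned when the groups are not in alphabetical order.

```r
# BAC prices are from the question; XYZ is a made-up second symbol
prices <- data.frame(
  symbol = c("XYZ", "XYZ", "XYZ", "BAC", "BAC", "BAC"),
  date = as.Date(c("2000-01-03", "2000-01-04", "2000-01-05",
                   "2000-01-03", "2000-01-04", "2000-01-05")),
  adjusted = c(10, 11, 9.9, 13.61120, 12.80331, 12.94381)
)

# Percentage change versus the previous row, with NA for the first row
pct_change <- function(adj) c(NA, diff(adj) / head(adj, -1) * 100)

# ave() applies pct_change within each symbol and returns values in row order
prices$pct_return <- ave(prices$adjusted, prices$symbol, FUN = pct_change)

prices
```

This assumes the rows are already sorted by date within each symbol; if they are not, sort with order(prices$symbol, prices$date) first.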