Home > Software engineering >  R: how to add rows in dataframe by manipulating a datetime column
R: how to add rows in dataframe by manipulating a datetime column

Time:10-01

I have a table of reference stock symbol that looked like this

enter image description here

I want to investigate the price change around each "showdate".

For exmaple, I want to add 2 new rows for one existing row, with a showdate -1 and 1 day, essentially inflating the table 3 times. Leaving all other columns empty except making sure the symbol is added correctly.

That is: 
for first row             ZTS 2017-01-09 Buy 
I want to add 2 more rows ZTS 2017-01-08
                          ZTS 2017-01-10

For the next row          ZTS 2016-11-02 Buy 
add 2 more rows           ZTS 2016-11-01 
                          ZTS 2016-11-03 
etc...

I read about add_rows() but the function wont let me. and this post doesn't solve it either Add rows to a data-frame based on values in one of the columns

Can someone do this? Usually I deal only with columns.....

>dput(head(sanitised))
structure(list(Company = c("Agilent Technologies ", "Agilent 
Technologies ", 
"Agilent Technologies ", "Agilent Technologies ", "Agilent 
Technologies ", 
"Alcoa "), symbol = c("A", "A", "A", "A", "A", "AA"), showdate = 
c("2021-08-18", 
"2020-04-16", "2017-11-17", "2017-03-23", "2016-05-13", "2016-10- 
07"
), call = c("Buy", "Buy", "Buy", "Buy", "Buy", "Buy"), 
show_segment = c("Guest Interview", 
"Guest Interview", "Discussed Stock", "Featured Stock", 
"Discussed Stock", 
"Discussed Stock"), call_price = c("$163.02", "$78.75", "$68.79", 
"$53.18", "$42.49", "$31.37"), current_returns = c(" 0.0%", 
" 0.0%", 
" 0.0%", " 0.0%", " 0.0%", " 0.0%"), day_7_returns = c("", "", 
"", " 0.3%", " 5.7%", "-15.7%"), day_14_returns = c("", "", "", 
"-0.1%", "", ""), day_30_returns = c("", "", "", "", " 6.8%", 
"-19.7%")), row.names = c(NA, 6L), class = "data.frame")

CodePudding user response:

Here is one approach using split and map_dfr. I've assumed that the values in most columns will be NA for the added rows. You could of course add other values in add_row.

library(tidyverse)

sanitised %>%
  mutate(showdate = as.Date(showdate, format = '%Y-%m-%d')) %>%
  split(list(sanitised$symbol, sanitised$showdate)) %>%
  map_dfr(function(x) {
    x %>%
      # any other columns you want to keep could go in here as well
      add_row(symbol = x$symbol, showdate = x$showdate - 1) %>%
      add_row(symbol = x$symbol, showdate = x$showdate   1)
  })

CodePudding user response:

Here is a base R approach with a loop assuming "symbol" is the only column copied:

x$showdate <- as.Date(x$showdate)
for(i in rownames(x)){
    x[paste0(i, ".", 1), "showdate"] <- x[i, "showdate"] - 1
    x[paste0(i, ".", 2), "showdate"] <- x[i, "showdate"]   1
    x[paste0(i, ".", 1), "symbol"] <- x[i, "symbol"]
    x[paste0(i, ".", 2), "symbol"] <- x[i, "symbol"]
}
x <- x[order(rownames(x)), ]
  • Related