I have a dataset that looks as follows:
data <- tribble(
~Date, ~Ticker, ~Close, ~Open,
"1989-09-11","COND",77.3292,77.3292,
"1989-09-12","COND",77.4435,77.4435,
"1989-09-13","COND",76.3118,76.3118,
"1989-09-14","COND",75.5309,75.6344,
"1989-09-15","COND",75.6598,75.4675)
# A tibble: 5 x 4
Date Ticker Close Open
<chr> <chr> <dbl> <dbl>
1 1989-09-11 COND 77.3 77.3
2 1989-09-12 COND 77.4 77.4
3 1989-09-13 COND 76.3 76.3
4 1989-09-14 COND 75.5 75.6
5 1989-09-15 COND 75.7 75.5
The issue with it is that until a certain date, the closing price is identical with the opening price. What I'm trying to do is writing a function that checks if the opening and closing price are the same, and if that's the case, it replaces the opening price with the closing price from the previous row. If applied to the above data, it would transform the data as follows:
# A tibble: 5 x 4
Date Ticker Close Open
<chr> <chr> <dbl> <dbl>
1 1989-09-11 COND 77.3 NA
2 1989-09-12 COND 77.4 77.3
3 1989-09-13 COND 76.3 77.4
4 1989-09-14 COND 75.5 75.6
5 1989-09-15 COND 75.7 75.5
I tried to do it with an if statement, but I'm running into problems as soon as I try to get the value from the previous row in the "Close" column to the current "Open" value.
CodePudding user response:
In dplyr
, it's a simple mutate
with lag
.
library(dplyr)
data %>%
mutate(Open = if_else(Open == Close, lag(Close), Open))
## A tibble: 5 x 4
# Date Ticker Close Open
# <chr> <chr> <dbl> <dbl>
#1 1989-09-11 COND 77.3 NA
#2 1989-09-12 COND 77.4 77.3
#3 1989-09-13 COND 76.3 77.4
#4 1989-09-14 COND 75.5 75.6
#5 1989-09-15 COND 75.7 75.5