I have a data frame with one variable, x. I want to create a new variable y which is equal to 1 when x decreases by 2 from its previous value and equal to 0 otherwise. Then I want to create a variable z which holds the value of x when y was last equal to 1. I want the initial value of z to be 0. I haven't been able to figure out how to make z. Any advice?
Here's what I'm trying to obtain (but for about 1000 rows):
x y z
9 0 0
8 0 0
6 1 6
9 0 6
7 1 7
5 1 5
I've tried lag() and the cumulative functions (cumsum() and friends) in dplyr, to no avail.
CodePudding user response:
library(dplyr)
library(tidyr)
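df <- data.frame(x = c(9,8,6,10,9,7,5))
df %>%
  # y: 1 when x dropped by exactly 2 from the previous row (row 1 is compared with itself, so 0)
  mutate(y = as.integer(lag(x, default = x[1]) - x == 2),
         # z: x where y == 1, 0 before the first hit, NA afterwards (filled in the next step)
         z = ifelse(cumsum(y) > 0 & y == 0, NA, x * y)) %>%
  fill(z, .direction = "down")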
#>    x y z
#> 1  9 0 0
#> 2  8 0 0
#> 3  6 1 6
#> 4 10 0 6
#> 5  9 0 6
#> 6  7 1 7
#> 7  5 1 5
Created on 2022-11-07 by the reprex package (v2.0.1)
CodePudding user response:
One option:
df$y = 0L
df$y[-1] = (diff(df$x) == -2L)
df$z = data.table::nafill(ifelse(df$y == 1L, df$x, NA), "locf", fill = 0L)
#   x y z
# 1 9 0 0
# 2 8 0 0
# 3 6 1 6
# 4 9 0 6
# 5 7 1 7
# 6 5 1 5
Reproducible data (please provide next time)
df = data.frame(x = c(9L,8L,6L,9L,7L,5L))
CodePudding user response:
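Here's a simple way to do it using dplyr, with tidyr for the fill step.
library(dplyr)
library(tidyr)
tmp = data.frame(x = c(9,8,6,9,7,5))
tmp %>%
  # default = first(x) makes the first difference 0 instead of NA
  mutate(y = ifelse(lag(x, default = first(x)) - x == 2, 1, 0)) %>%
  # keep x only where y == 1, then carry that value down
  mutate(z = ifelse(y == 1, x, NA)) %>%
  fill(z) %>%
  # rows before the first y == 1 are still NA; set them to 0
  replace_na(list(z = 0))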
# output
#   x y z
# 1 9 0 0
# 2 8 0 0
# 3 6 1 6
# 4 9 0 6
# 5 7 1 7
# 6 5 1 5
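If you'd rather avoid extra packages, the same "carry the last flagged value forward" idea can be sketched in base R by indexing with cumsum(). This is only a sketch on the example data from the question; the helper vector `marks` is a name made up for illustration, and it assumes x has no NAs.

```r
# Example data from the question
df <- data.frame(x = c(9, 8, 6, 9, 7, 5))

# y: 1 where x dropped by exactly 2 from the previous row; the first row is 0
df$y <- c(0L, as.integer(diff(df$x) == -2))

# marks (hypothetical helper): the initial value 0, followed by every x where y == 1
marks <- c(0, df$x[df$y == 1L])

# cumsum(y) counts how many flagged rows have been seen so far,
# so cumsum(y) + 1 picks the most recent flagged x (or the initial 0)
df$z <- marks[cumsum(df$y) + 1L]

df
```

Because z is built by plain indexing, there is no NA fill step, and the initial value is just the first element of `marks`.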