R: Create a variable that uses value from previous row in a for loop

Time:10-20

I am trying to create a variable (z) that flags whether a row differs from the previous row on another variable (x). Starting from row 2, if a row's x differs from the x in the row above it, I would like z to take the value 1, otherwise 0.

I have tried different if and if-else statements based on this question (For Loop that References the Previous Row in R), but they do not give me the desired results.


df <-
  data.frame(
    x = c(1, 1, 2, 0, 0, 0, 0, 1, 1, 2),
    y = c(1, 1, 2, 0, 0, 0, 0, 1, 1, 2),
    z = c(0, 1, 2, 0, 0, 0, 0, 1, 1, 2)
  )

for (i in 2:length(df)) {
  df$z <- ifelse(df$x[i] != df$x[i - 1], 1, 0)
}


for (i in 2:length(df)) {
  if (df$x[i] != df$x[i - 1]) {
    df$z == 1
  } else{
    df$z == 0
  }
}

My expected results are:


df_expected <-
  data.frame(
    x = c(1, 1, 2, 0, 0, 0, 0, 1, 1, 2),
    y = c(1, 1, 2, 0, 0, 0, 0, 1, 1, 2),
    z = c(NA, 0, 1, 1, 0, 0, 0, 1, 0, 1)
  )

Thanks a lot in advance!

CodePudding user response:

Edit: If you need to use a for loop, you could use

df$z <- 0
for (i in 2:nrow(df)) {
  # 1 if x differs from the previous row, 0 otherwise; row 1 stays 0
  df[i, "z"] <- as.integer(df[i, "x"] != df[i - 1, "x"])
}

The problem with your code is:

df$z == 1

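doesn't assign anything; == is a logical comparison, so its result is computed and then discarded. To set a value you need <-. Note also that length(df) returns the number of columns of a data frame (3 here), not the number of rows, so the loop should run over 2:nrow(df).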
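
To make the difference between comparing and assigning concrete, here is a minimal sketch on a cut-down version of the question's data (only the x column is kept, and the names z_after_compare, n_cols and n_rows are introduced purely for illustration):

```r
df <- data.frame(x = c(1, 1, 2, 0, 0, 0, 0, 1, 1, 2))
df$z <- 0

df$z == 1                 # comparison: returns a logical vector, df is unchanged
z_after_compare <- df$z   # still all zeros

df$z[2] <- 1              # assignment: this actually modifies row 2 of z

n_cols <- length(df)      # number of columns of the data frame
n_rows <- nrow(df)        # number of rows
```

Here length(df) is 2 while nrow(df) is 10, which is why a loop over 2:length(df) stops long before the last row.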


Without a loop, you could use

library(dplyr)

df %>%
  mutate(z = as.integer(x != lag(x, default = first(x))))

This returns

   x y z
1  1 1 0
2  1 1 0
3  2 2 1
4  0 0 1
5  0 0 0
6  0 0 0
7  0 0 0
8  1 1 1
9  1 1 0
10 2 2 1
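
Note that this gives 0 in the first row. If the first row should instead be NA, since it has no previous row to compare with (as in the question's expected output), one possible base R sketch is the following. It is an alternative to the dplyr call above, not a reproduction of it, and it again uses only the x column of the question's data:

```r
df <- data.frame(x = c(1, 1, 2, 0, 0, 0, 0, 1, 1, 2))

# Compare rows 2..n with rows 1..(n - 1), then prepend NA for row 1
df$z <- c(NA, as.integer(df$x[-1] != df$x[-nrow(df)]))
```

Dropping the first element (df$x[-1]) and the last element (df$x[-nrow(df)]) lines each value up with its predecessor, so the comparison is done in one vectorized step.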

CodePudding user response:

Using data.table

library(data.table)
setDT(df)[, z := as.integer(x != shift(x, fill = first(x)))]