Find value changes across rows in R

Time:06-23

I have a data frame which looks like the following:

dat <- data.frame(Target = c(rep("01", times = 8), rep("02", times = 4)),
                  targ1clicks = c(1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1))

    Target targ1clicks
1      01           1
2      01           1
3      01           1
4      01           1
5      01           0
6      01           0
7      01           0
8      01           1
9      02           0
10     02           0
11     02           0
12     02           1

I want to count, for each Target, how many times the value in the targ1clicks column changes between 1 and 0 from one row to the next. For Target 01 the answer should be 2 (1 to 0 at row 5, then 0 to 1 at row 8); for Target 02 it should be 1 (0 to 1 at row 12). Some similar posts helped me get to the code below, but it does not produce the correct values and I can't figure out why.

Here is the code I have thus far:

library(dplyr)

# find the row numbers where there is a change
clickRows <- which(dat$targ1clicks != dplyr::lag(dat$targ1clicks))

dat2 <- dat[c(clickRows),] #filter by these row numbers and save as new df

dat2 <- dat2 %>% 
  group_by(Target) %>% 
  filter(targ1clicks == 1) #filter just on the '1' values

# This should show the number of value changes for each target
no.accClicks <- as.data.frame(table(dat2$Target))

The trouble is, this code doesn't give me the correct value for Target 01 - it gives me the following:

   Var1 Freq
1   01    1
2   02    1

The Freq column should indicate the number of value changes, so it should have 2 for the 01 target, and I can't figure out why this isn't working correctly! Any insight into why this isn't working or any alternatives on how to code this would be hugely appreciated.

CodePudding user response:

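You were on the right track. Within each Target group, use lag() to compare every value with the previous one. This gives a logical vector: TRUE where the value changed, FALSE where it did not, and NA for the first row of each group, which has no previous value. Summing that vector with na.rm = TRUE gives the number of changes per group.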

library(dplyr)
dat %>% 
    group_by(Target) %>% 
    summarise(changes = sum(targ1clicks != lag(targ1clicks), na.rm = TRUE))
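As for why the original attempt returned 1 for Target 01, the sketch below retraces it step by step on the question's data. It is an illustration rather than a definitive diagnosis; the names kept and changes are introduced here only for the example. Two things appear to go wrong: lag() is applied to the whole column, so row 9 is compared with row 8 from the other target, and filtering on targ1clicks == 1 throws away every 1 to 0 change.

```r
library(dplyr)

dat <- data.frame(Target = c(rep("01", times = 8), rep("02", times = 4)),
                  targ1clicks = c(1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1))

# Ungrouped lag(): row 9 (Target 02) is compared with row 8 (Target 01),
# so the group boundary is reported as a change.
clickRows <- which(dat$targ1clicks != lag(dat$targ1clicks))
clickRows
# rows 5, 8, 9 and 12

# Keeping only rows whose new value is 1 discards the 1 -> 0 changes
# (rows 5 and 9), which leaves a single row per target.
kept <- clickRows[dat$targ1clicks[clickRows] == 1]
table(dat$Target[kept])

# Grouped comparison: the first row of each group is compared with NA,
# and na.rm = TRUE drops that comparison from the sum.
changes <- dat %>%
  group_by(Target) %>%
  summarise(changes = sum(targ1clicks != lag(targ1clicks), na.rm = TRUE))
changes
```

Doing the comparison inside group_by() and summing all TRUE values, instead of filtering on 1, addresses both problems at once.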

CodePudding user response:

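Alternatively, take the differences between consecutive values within each group. Because targ1clicks only holds 0 and 1, every change is a difference of 1 or -1, so the sum of the absolute differences is the number of changes: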

library(dplyr)
dat |>
    group_by(Target) |>
    summarise(Freq = sum(abs(diff(targ1clicks))))

# A tibble: 2 × 2
  Target  Freq
  <chr>  <dbl>
1 01         2
2 02         1
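Note that sum(abs(diff(x))) equals the number of changes only when consecutive values differ by exactly 1, as they do for a 0/1 column. A minimal base R sketch that does not rely on this, assuming the same dat as in the question, is shown below; count_changes is a hypothetical helper name introduced for this example.

```r
dat <- data.frame(Target = c(rep("01", times = 8), rep("02", times = 4)),
                  targ1clicks = c(1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1))

# Count the positions where consecutive values differ, whatever the step size.
count_changes <- function(x) sum(diff(x) != 0)

# Apply the helper per Target; tapply() returns a vector named by Target.
changes <- tapply(dat$targ1clicks, dat$Target, count_changes)
changes
# 01: 2, 02: 1

# The same count via run-length encoding: number of runs minus one.
runs <- tapply(dat$targ1clicks, dat$Target,
               function(x) length(rle(x)$lengths) - 1)
runs
```

Both versions need only base R, which can be convenient when dplyr is not available.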