Extract duplicate rows based on a condition across columns in R

Time:01-19

I'm stuck trying to collapse duplicated rows based on a condition in R. I want to apply the same condition across a large number of columns. In the example below, rows that share an ID should be reduced to a single row, and each column should be 0 if any of the duplicated rows has a 0 in that column.

Here is the data frame:

   ID  A B C
1  001 1 1 1
2  002 0 1 0
3  002 1 0 0
4  003 0 1 1
5  003 1 0 1
6  003 0 0 1

I want to get this:

   ID  A B C
1  001 1 1 1
2  002 0 0 0
3  003 0 0 1

Any help would be much appreciated, thanks!

CodePudding user response:

Please check this code. Input data (ID was read in as numeric):

# A tibble: 6 × 4
     ID     A     B     C
  <dbl> <dbl> <dbl> <dbl>
1     1     1     1     1
2     2     0     1     0
3     2     1     0     0
4     3     0     1     1
5     3     1     0     1
6     3     0     0     1

code

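library(dplyr)
library(tidyr)
data2 <- data %>%
  group_by(ID) %>%
  # Mark zeros as 0 and everything else as NA in new columns xA, xB, xC
  mutate(across(c(A, B, C), ~ ifelse(.x == 0, 0, NA), .names = "x{.col}")) %>%
  fill(xA, xB, xC) %>%  # carry each zero down to the last row of its group
  mutate(across(c(xA, xB, xC), ~ ifelse(is.na(.x), 1, .x))) %>%  # no zero seen: 1
  slice_tail(n = 1)     # keep only the last row per ID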

output (xA, xB and xC hold the collapsed values; A, B and C are just the last row of each ID)

# A tibble: 3 × 7
# Groups:   ID [3]
     ID     A     B     C    xA    xB    xC
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     1     1     1     1     1     1
2     2     1     0     0     0     0     0
3     3     0     0     1     0     0     1
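
If every column holds only 0 and 1, as in the example, "0 if any duplicated row has a 0" is the same as taking the minimum per ID. A shorter sketch under that assumption (the names `data` and `result` are only illustrative, and ID is kept as a character column as in the question):

```r
library(dplyr)

# Example data from the question
data <- tibble(
  ID = c("001", "002", "002", "003", "003", "003"),
  A  = c(1, 0, 1, 0, 1, 0),
  B  = c(1, 1, 0, 1, 0, 0),
  C  = c(1, 0, 0, 1, 1, 1)
)

# Assumes values are only 0 or 1, so min() returns 0 whenever a 0 is present.
# everything() skips the grouping column, so this scales to many columns.
result <- data %>%
  group_by(ID) %>%
  summarise(across(everything(), min), .groups = "drop")

print(result)
```

This returns one row per ID with the original column names (001: 1 1 1, 002: 0 0 0, 003: 0 0 1). If the columns can hold values other than 0 and 1, min() would no longer express the condition and the function passed to across() would need to be adapted.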
