I'm stuck trying to keep rows based on a condition in R. I want to collapse duplicated rows (rows sharing the same ID) using the same condition across a large number of columns. In the example below, for each duplicated ID I want a single row that has the value 0 in every column where any of the duplicated rows has a 0.
Here is the data frame:
ID A B C
1 001 1 1 1
2 002 0 1 0
3 002 1 0 0
4 003 0 1 1
5 003 1 0 1
6 003 0 0 1
I want to get a result like this:
ID A B C
1 001 1 1 1
2 002 0 0 0
3 003 0 0 1
Any help would be much appreciated, thanks!
CodePudding user response:
Please check the code below. This is the input data as a tibble:
# A tibble: 6 × 4
ID A B C
<dbl> <dbl> <dbl> <dbl>
1 1 1 1 1
2 2 0 1 0
3 2 1 0 0
4 3 0 1 1
5 3 1 0 1
6 3 0 0 1
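Code:
library(dplyr); library(tidyr)  # across() is from dplyr (>= 1.0), fill() is from tidyr
data2 <- data %>% group_by(ID) %>%
  # flag zeros: xA/xB/xC are 0 where the source column is 0, NA otherwise
  mutate(across(c('A','B','C'), ~ ifelse(.x==0, 0, NA), .names = 'x{col}')) %>%
  # carry each 0 downwards within its ID group
  fill(xA, xB, xC) %>%
  # columns that never contained a 0 are still NA; set them to 1 (assumes 0/1 data)
  mutate(across(c('xA','xB','xC'), ~ ifelse(is.na(.x), 1, .x))) %>%
  # the last row of each ID now holds the combined result in xA, xB, xC
  slice_tail(n=1)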
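Output (the combined values are in xA, xB and xC; A, B and C still show the last original row of each ID):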
# A tibble: 3 × 7
# Groups: ID [3]
ID A B C xA xB xC
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 1 1 1 1 1
2 2 1 0 0 0 0 0
3 3 0 0 1 0 0 1
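If the columns only ever contain 0 and 1, as in the example, "0 if any duplicated row has a 0" is the same as taking the minimum of each column within an ID. Under that assumption, a shorter sketch could look like the following. The data frame is rebuilt from the question here, and the name `result` is only illustrative; `across(everything(), ...)` skips the grouping column, so it should scale to many columns without listing them.

```r
library(dplyr)

# Example data from the question; ID is kept as character to preserve "001"
data <- data.frame(
  ID = c("001", "002", "002", "003", "003", "003"),
  A  = c(1, 0, 1, 0, 1, 0),
  B  = c(1, 1, 0, 1, 0, 0),
  C  = c(1, 0, 0, 1, 1, 1)
)

# Assumption: every value column is 0/1, so the group minimum is 0
# exactly when at least one row of that ID has a 0 in the column.
result <- data %>%
  group_by(ID) %>%
  summarise(across(everything(), min), .groups = "drop")

print(result)
```

This returns one row per ID with the original column names, so no helper columns need to be dropped afterwards. If the data can hold values other than 0 and 1, the minimum no longer matches the stated condition and the function passed to `across()` would need to be adapted.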