In this example data, I want filter out those rows where col1
and col2
are both zeros.
df <- data.frame(
id = rep(letters[1:3], each = 2),
col1 = c(1, 1, 0, 0, 0, 0),
col2 = c(0, 1, 0, 0, 1, 0),
col3 = LETTERS[1:6]
)
This is what I tried:
library(dplyr)
df %>%
filter(col1 == 1 | col2 == 1)
# id col1 col2 col3
# 1 a 1 0 A
# 2 a 1 1 B
# 3 c 0 1 E
It works fine but all rows where id
is "b"
are removed. Are there any ways to keep at least one row for each id
? My expected output is:
# id col1 col2 col3
# 1 a 1 0 A
# 2 a 1 1 B
# 3 b NA NA -
# 4 c 0 1 E
CodePudding user response:
You could try grouping:
library(dplyr)
df %>%
group_by(id) %>%
filter(col1 == 1 | col2 == 1 | row_number() == 1) %>%
ungroup()
This returns
# A tibble: 4 x 4
id col1 col2 col3
<chr> <dbl> <dbl> <chr>
1 a 1 0 A
2 a 1 1 B
3 b 0 0 C
4 c 0 1 E
CodePudding user response:
You can use tidyr::complete()
:
df %>%
filter(col1 == 1 | col2 == 1) %>%
tidyr::complete(id = df$id, fill = list(col3 = "-"))
# # A tibble: 4 × 4
# id col1 col2 col3
# <chr> <dbl> <dbl> <chr>
# 1 a 1 0 A
# 2 a 1 1 B
# 3 b NA NA -
# 4 c 0 1 E