I would like to capture the occurrence of "Yes" in a chosen set of variables. My attempt below returns NA for every row that contains a missing value:
library(dplyr)
set.seed(2022)
mydata <- tibble::tibble(
  "id" = 1:100,
  "a1" = sample(c(rep("Yes", 40), rep_len(NA, 100)), 100),
  "a2" = sample(c(rep("Yes", 50), rep_len(NA, 100)), 100),
  "a3" = sample(c(rep("Yes", 40), rep_len(NA, 100)), 100),
  "a4" = sample(c(rep("Yes", 50), rep_len(NA, 100)), 100),
  "b2" = rnorm(100, 50, 10)
)
# Goal is to capture any occurrence of Yes in (a* variables)
anymatch <- function(vars){
  rowSums(select(cur_data(), all_of(vars))=="Yes")
}
avars <- paste0("a", 1:4)
mydata %>%
  mutate(afin = anymatch(avars)) %>%
  select(all_of(avars), afin)
CodePudding user response:
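We need na.rm = TRUE in rowSums(). Comparing NA with "Yes" gives NA rather than FALSE, so without it any row that contains an NA sums to NA.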
anymatch <- function(vars){
  rowSums(select(cur_data(), all_of(vars))=="Yes", na.rm = TRUE)
}
Now it gives the correct count:
mydata %>%
  mutate(afin = anymatch(avars)) %>%
  select(all_of(avars), afin)
# A tibble: 100 × 5
a1 a2 a3 a4 afin
<chr> <chr> <chr> <chr> <dbl>
1 <NA> <NA> <NA> <NA> 0
2 <NA> Yes <NA> Yes 2
3 Yes <NA> <NA> <NA> 1
4 <NA> Yes Yes <NA> 2
5 Yes Yes <NA> <NA> 2
6 Yes Yes Yes Yes 4
7 <NA> Yes <NA> <NA> 1
8 <NA> <NA> <NA> <NA> 0
9 Yes Yes <NA> Yes 3
10 <NA> Yes <NA> <NA> 1
# … with 90 more rows
# ℹ Use `print(n = ...)` to see more rows
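To see why na.rm = TRUE matters, here is a small base R sketch on a made-up three-row data frame (the object names m, cmp, without_narm and with_narm are only for illustration and are not part of the question's data):

```r
# Toy data: hypothetical, not taken from mydata
m <- data.frame(
  a1 = c("Yes", NA, NA),
  a2 = c("Yes", "Yes", NA),
  stringsAsFactors = FALSE
)

# Element-wise comparison: NA == "Yes" is NA, not FALSE
cmp <- m == "Yes"

# Without na.rm, any row holding an NA sums to NA
without_narm <- rowSums(cmp)

# With na.rm = TRUE, the NA cells are dropped before summing
with_narm <- rowSums(cmp, na.rm = TRUE)

print(without_narm)
print(with_narm)
```

Only the first row, which has no missing values, gets a number from the first call; the second call counts the "Yes" cells in every row.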
From dplyr 1.1.0 onward, cur_data() is deprecated and pick() can be used instead:
anymatch <- function(vars){
  rowSums(pick(all_of(vars))=="Yes", na.rm = TRUE)
}
mydata %>%
  mutate(afin = anymatch(avars)) %>%
  select(all_of(avars), afin)
# A tibble: 100 × 5
a1 a2 a3 a4 afin
<chr> <chr> <chr> <chr> <dbl>
1 <NA> <NA> <NA> <NA> 0
2 <NA> Yes <NA> Yes 2
3 Yes <NA> <NA> <NA> 1
4 <NA> Yes Yes <NA> 2
5 Yes Yes <NA> <NA> 2
6 Yes Yes Yes Yes 4
7 <NA> Yes <NA> <NA> 1
8 <NA> <NA> <NA> <NA> 0
9 Yes Yes <NA> Yes 3
10 <NA> Yes <NA> <NA> 1
# … with 90 more rows
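If a TRUE/FALSE flag for "any Yes" is wanted rather than a count, one possible variant is if_any() inside mutate(). This is only a sketch, assuming dplyr 1.0.4 or later; the toy tibble and the column name any_yes are made up for illustration:

```r
library(dplyr)

# Toy data: hypothetical, not taken from mydata
toy <- tibble::tibble(
  a1 = c("Yes", NA, NA),
  a2 = c(NA, "Yes", NA)
)
avars <- c("a1", "a2")

# %in% returns FALSE for NA, so no na.rm handling is needed here
toy_flagged <- toy %>%
  mutate(any_yes = if_any(all_of(avars), ~ .x %in% "Yes"))

print(toy_flagged)
```

The same flag can also be derived from the count above with afin > 0.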