I have been struggling with this dataset for hours. I have searched and tried many things without success (I am a novice in R), so I really hope you can help me.
I have this dataset:
df <- data.frame(ID = c(1,2,3,4,5), a.1 = c("A", "C", "C", "B","D"), a.2 = c("C", "C", "D", "A","B"), b.1 = c("D", "C", "A", "B","D"), b.2 = c("D", "B", "C", "A","A"))
  ID a.1 a.2 b.1 b.2
1  1   A   C   D   D
2  2   C   C   C   B
3  3   C   D   A   C
4  4   B   A   B   A
5  5   D   B   D   A
I would like to add a new variable called "result" (for example with mutate) that is:
- "1" if at least one of the columns with prefix "a." contains "A" or "B"
- "0" if none of the columns with prefix "a." contains "A" or "B"
So I would get the following result:
  ID a.1 a.2 b.1 b.2 result
1  1   A   C   D   D      1
2  2   C   C   C   B      0
3  3   C   D   A   C      0
4  4   B   A   B   A      1
5  5   D   B   D   A      1
In my real dataset I have 100 variables with prefix "a.", so I cannot select the columns individually.
Hopefully you guys can help me!
Thank you very much!
CodePudding user response:
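library(dplyr)

df %>%
  rowwise() %>%  # evaluate the next step one row at a time
  # c_across() collects all "a." columns of the current row into one vector;
  # any(... %in% ...) is TRUE if that vector holds "A" or "B"; * 1L turns it into 1/0
  mutate(res = any(c_across(starts_with("a.")) %in% c("A", "B")) * 1L)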
#> # A tibble: 5 x 6
#> # Rowwise:
#>      ID a.1   a.2   b.1   b.2     res
#>   <dbl> <chr> <chr> <chr> <chr> <int>
#> 1     1 A     C     D     D         1
#> 2     2 C     C     C     B         0
#> 3     3 C     D     A     C         0
#> 4     4 B     A     B     A         1
#> 5     5 D     B     D     A         1
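If the row-by-row evaluation turns out to be slow on a larger dataset, a vectorised variant in base R may be worth trying. The following is only a sketch, not part of the answer above: it assumes the relevant columns can be identified purely by the "a." name prefix, and the helper names `a_cols` and `hits` are made up for illustration. It names the new column `result`, as asked in the question.

```r
# Example data from the question
df <- data.frame(ID = c(1, 2, 3, 4, 5),
                 a.1 = c("A", "C", "C", "B", "D"),
                 a.2 = c("C", "C", "D", "A", "B"),
                 b.1 = c("D", "C", "A", "B", "D"),
                 b.2 = c("D", "B", "C", "A", "A"))

# Logical index of the columns whose name starts with "a."
# (assumption: the prefix alone identifies the columns of interest)
a_cols <- startsWith(names(df), "a.")

# One logical vector per "a." column: TRUE where the value is "A" or "B"
hits <- lapply(df[a_cols], `%in%`, c("A", "B"))

# Combine the per-column vectors with element-wise OR, then convert TRUE/FALSE to 1/0
df$result <- as.integer(Reduce(`|`, hits))

df
```

Because each column is checked as a whole vector, this avoids looping over rows. Within dplyr, `if_any(starts_with("a."), ~ .x %in% c("A", "B"))` inside `mutate()` should give the same logical vector, assuming dplyr 1.0.4 or later.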
CodePudding user response:
Thank you very much!!
It is crazy how little code you need for this, while I was struggling for so long. I still have a lot to learn. :)