Mutate variable if certain columns contain string in R

Time:11-18

I have been struggling with this dataset for hours. I have searched and tried many things without success (I am a novice in R), so I really hope you can help me.

I have this dataset:

df <- data.frame(
  ID  = c(1, 2, 3, 4, 5),
  a.1 = c("A", "C", "C", "B", "D"),
  a.2 = c("C", "C", "D", "A", "B"),
  b.1 = c("D", "C", "A", "B", "D"),
  b.2 = c("D", "B", "C", "A", "A")
)
    
  ID a.1 a.2 b.1 b.2
1  1   A   C   D   D
2  2   C   C   C   B
3  3   C   D   A   C
4  4   B   A   B   A
5  5   D   B   D   A

I would like to create a new variable called "result" that is:

  • "1" if at least one of the columns with prefix "a." contains "A" or "B"
  • "0" if none of the columns with prefix "a." contains "A" or "B"

So I would get the following result:

  ID a.1 a.2 b.1 b.2 result
1  1   A   C   D   D      1
2  2   C   C   C   B      0
3  3   C   D   A   C      0
4  4   B   A   B   A      1
5  5   D   B   D   A      1

In my real dataset I have 100 variables with prefix "a.", so I cannot select the columns individually.

Hopefully you guys can help me!

Thank you very much!

CodePudding user response:

library(dplyr)

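df %>%
  rowwise() %>%  # evaluate the next step one row at a time
  # TRUE if any "a." column in the row holds "A" or "B"; * 1L turns TRUE/FALSE into 1/0
  # (the column is named res here; write result = ... to match the question)
  mutate(res = any(c_across(starts_with("a.")) %in% c("A", "B")) * 1L)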

#> # A tibble: 5 x 6
#> # Rowwise: 
#>      ID a.1   a.2   b.1   b.2     res
#>   <dbl> <chr> <chr> <chr> <chr> <int>
#> 1     1 A     C     D     D         1
#> 2     2 C     C     C     B         0
#> 3     3 C     D     A     C         0
#> 4     4 B     A     B     A         1
#> 5     5 D     B     D     A         1
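
For comparison, here is a minimal base R sketch of the same rule that avoids row-by-row evaluation, which may matter with 100 "a." columns. It is not part of the answer above; the helper names `a_cols` and `hits` are made up for illustration, and it assumes the columns hold plain character values as in the example data. (dplyr 1.0.4 and later also offers `if_any()` for this kind of test inside `mutate()`.)

```r
# Same example data as in the question
df <- data.frame(
  ID  = c(1, 2, 3, 4, 5),
  a.1 = c("A", "C", "C", "B", "D"),
  a.2 = c("C", "C", "D", "A", "B"),
  b.1 = c("D", "C", "A", "B", "D"),
  b.2 = c("D", "B", "C", "A", "A")
)

# Logical index of every column whose name starts with "a."
a_cols <- startsWith(names(df), "a.")

# One logical vector per "a." column: TRUE where the value is "A" or "B"
hits <- lapply(df[a_cols], function(col) col %in% c("A", "B"))

# Combine the columns element-wise with OR, then convert TRUE/FALSE to 1/0
df$result <- as.integer(Reduce(`|`, hits))

df$result
```

Because `Reduce()` combines whole columns at once, each row ends up as 1 when any of its "a." columns matched and 0 otherwise, without grouping the data by row.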

CodePudding user response:

Thank you very much!!

It is crazy how little code you need for this, while I was struggling for so long. There is still plenty for me to learn. :)
