R filtering based on part of value (character)


I have a data frame with a column "Role" that contains character values.

Example data frame:

N Year Name Role              
1 2010 A    CEO
2 2012 A    CEO
3 2010 B    Chairman/CEO
4 2015 B    Chairman/CEO
5 2016 C    CEO/CFO/Chairman
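
For anyone who wants to run the answers below, the example can be recreated roughly like this. It is only a sketch: the column types (numeric N and Year, character Name and Role) are assumed from the printed table, and the name df is an assumption that matches what the answers use.

```r
# Recreate the example data frame (column types are assumed from the printout)
df <- data.frame(
  N    = 1:5,
  Year = c(2010, 2012, 2010, 2015, 2016),
  Name = c("A", "A", "B", "B", "C"),
  Role = c("CEO", "CEO", "Chairman/CEO", "Chairman/CEO", "CEO/CFO/Chairman"),
  stringsAsFactors = FALSE  # keep Role as character, also on R < 4.0
)
```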

I want to assign 1 to every observation that has "Chairman" among its roles and 0 to every observation that does not. The desired output would look as follows:

N Year Name Role               Chairman
1 2010 A    CEO                0
2 2012 A    CEO                0
3 2010 B    Chairman/CEO       1  
4 2015 B    Chairman/CEO       1
5 2016 C    CEO/CFO/Chairman   1

The "Role" column has about 150 unique values, so writing an if statement for every separate option is impractical.

Is there a way to code this so that I get the output above?

Thank you all in advance!

CodePudding user response:

You could use grepl and ifelse:

df$Chairman <- ifelse(grepl('Chairman', df$Role), 1, 0)

df
#>   N Year Name             Role Chairman
#> 1 1 2010    A              CEO        0
#> 2 2 2012    A              CEO        0
#> 3 3 2010    B     Chairman/CEO        1
#> 4 4 2015    B     Chairman/CEO        1
#> 5 5 2016    C CEO/CFO/Chairman        1

Or even just

df$Chairman <- as.numeric(grepl('Chairman', df$Role))

Or, if "Chairman" sometimes appears without a capital, you can use

df$Chairman <- as.numeric(grepl('Chairman', df$Role, ignore.case = TRUE))

Created on 2022-04-26 by the reprex package (v2.0.1)
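
One caveat: grepl() matches the pattern anywhere in the string, so a role such as "Vice Chairman/CFO" would be flagged as well. If only an exact "Chairman" entry should count, one option is to split each value on "/" and test for membership. The sketch below assumes roles are always separated by "/" with no surrounding spaces; the "Vice Chairman/CFO" value is invented for illustration and does not come from the question's data.

```r
# Hypothetical roles; "Vice Chairman/CFO" is made up to show the difference
roles <- c("CEO", "Chairman/CEO", "CEO/CFO/Chairman", "Vice Chairman/CFO")

# Substring match: flags any value that contains "Chairman" anywhere
substring_flag <- as.numeric(grepl("Chairman", roles, fixed = TRUE))

# Exact match: split on "/" and check whether "Chairman" is one of the parts
parts <- strsplit(roles, "/", fixed = TRUE)
exact_flag <- as.numeric(
  vapply(parts, function(p) "Chairman" %in% p, logical(1))
)

substring_flag
exact_flag
```

The two vectors differ only for the last element, which the substring match flags and the exact match does not.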

CodePudding user response:

str_detect() is an easy way:

library(dplyr)  # provides tribble(), mutate() and %>%

tribble(
  ~N, ~Year, ~Name, ~Role,
  1, 2010, "A", "CEO",
  2, 2012, "A", "CEO",
  3, 2010, "B", "Chairman/CEO",
  4, 2015, "B", "Chairman/CEO",
  5, 2016, "C", "CEO/CFO/Chairman"
) %>%
  # flag rows whose Role contains "Chairman"
  mutate(Chairman = ifelse(stringr::str_detect(Role, "Chairman"), 1, 0))
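
The shortcuts from the first answer carry over to stringr. As a sketch, as.integer() can stand in for ifelse(), and stringr's regex() helper with ignore_case = TRUE covers lower-case spellings. The roles vector here is a made-up sample, not the asker's data.

```r
library(stringr)

# Made-up sample values, including a lower-case spelling
roles <- c("CEO", "Chairman/CEO", "CEO/CFO/chairman")

# str_detect() returns a logical vector; as.integer() maps TRUE/FALSE to 1/0
chairman_flag <- as.integer(
  str_detect(roles, regex("chairman", ignore_case = TRUE))
)
chairman_flag
```

Inside mutate(), the same expression would be used with the Role column in place of roles.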