I want to create another column that is 1 if any of the words in the pattern appears in either the RoleName or the FulltextDescription column. I need to check both because RoleName might only say 'VP' while FulltextDescription says that the person is VP of the cyber department.
My code right now looks like this:
pattern <- paste(c("cyber", "Cyber", "technology", "Technology", "computer", "Computer"), collapse = "|")
IPEm <- IPEm %>%
mutate(`Cyber Job` = ifelse(str_detect(RoleName|FulltextDescription, pattern), 1, 0))
Unfortunately, it doesn't work.
CodePudding user response:
The OR operator | is a logical operator that works on booleans (things that are TRUE or FALSE). You can't (meaningfully) use | on a column unless that column is boolean (or you want it to be treated as boolean). You can apply | to the results of str_detect because str_detect returns a boolean TRUE or FALSE:
str_detect(RoleName, pattern) | str_detect(FulltextDescription, pattern)
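As a minimal sketch, here is how that expression might slot into the mutate call from the question. The three sample rows are hypothetical stand-ins for the real IPEm data, which is not shown in the question.

```r
library(dplyr)
library(stringr)

pattern <- paste(c("cyber", "Cyber", "technology", "Technology", "computer", "Computer"), collapse = "|")

# Hypothetical sample rows; the real IPEm data is not shown in the question
IPEm <- tibble(
  RoleName = c("VP", "Computer Analyst", "VP"),
  FulltextDescription = c("VP of the cyber department", "Analyst", "VP Finance")
)

# Run str_detect on each column separately, then combine the two logical vectors with |
IPEm <- IPEm %>%
  mutate(`Cyber Job` = ifelse(
    str_detect(RoleName, pattern) | str_detect(FulltextDescription, pattern),
    1, 0
  ))
```

The first row matches only through FulltextDescription and the second only through RoleName, which is the situation the question describes.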
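You could also concatenate the text with paste and do a single str_detect on the combined text. The default space separator of paste keeps a match from spanning the end of one column and the start of the other:
str_detect(paste(RoleName, FulltextDescription), pattern)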
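One difference between the two approaches that may matter is how they treat missing values. The sketch below uses hypothetical vectors (role, desc) rather than the real columns: str_detect returns NA for an NA input, whereas pasting first turns NA into the literal text "NA", so the combined version never returns NA.

```r
library(stringr)

pattern <- "cyber|technology|computer"

# Hypothetical vectors with a missing value in each column
role <- c("VP", NA, "VP")
desc <- c("VP of cyber dept.", "VP Finance", NA)

# Separate detection: NA | FALSE and FALSE | NA both stay NA
either <- str_detect(role, pattern) | str_detect(desc, pattern)

# Concatenated detection: NA becomes the string "NA" before matching
combined <- str_detect(paste(role, desc), pattern)

print(either)
print(combined)
```

Which behavior is preferable depends on whether a missing description should count as "no match" or as "unknown".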
CodePudding user response:
You could also match case-insensitively with ignore_case and simplify the pattern:
library(tidyverse)
pattern <- paste(c("cyber", "technology", "computer"), collapse = "|")
IPEm <- tribble(
  ~RoleName, ~FulltextDescription,
  "VP", "VP of Cyber dept.",
  "VP Technology", "VP of Tech",
  "VP", "VP of cyber dept.",
  "VP", "VP Finance",
  "VP technology", "VP of Tech"
)
IPEm %>%
  mutate(cyber_job = if_else(
    str_detect(RoleName, regex(pattern, ignore_case = TRUE)) |
      str_detect(FulltextDescription, regex(pattern, ignore_case = TRUE)),
    1, 0
  ))
#> # A tibble: 5 × 3
#>   RoleName      FulltextDescription cyber_job
#>   <chr>         <chr>                   <dbl>
#> 1 VP            VP of Cyber dept.           1
#> 2 VP Technology VP of Tech                  1
#> 3 VP            VP of cyber dept.           1
#> 4 VP            VP Finance                  0
#> 5 VP technology VP of Tech                  1
Created on 2022-06-27 by the reprex package (v2.0.1)
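If more text columns need checking later, repeating str_detect for each one gets verbose. A possible alternative, not used in the answers above, is dplyr's if_any, which applies the same test to several columns and returns TRUE when any of them passes. This sketch assumes dplyr 1.0.4 or later and reuses the sample data from the reprex; the name result is just illustrative.

```r
library(dplyr)
library(stringr)
library(tibble)

pattern <- paste(c("cyber", "technology", "computer"), collapse = "|")

IPEm <- tribble(
  ~RoleName, ~FulltextDescription,
  "VP", "VP of Cyber dept.",
  "VP Technology", "VP of Tech",
  "VP", "VP of cyber dept.",
  "VP", "VP Finance",
  "VP technology", "VP of Tech"
)

# if_any() runs the detection on each listed column and ORs the results row by row
result <- IPEm %>%
  mutate(cyber_job = if_else(
    if_any(
      c(RoleName, FulltextDescription),
      ~ str_detect(.x, regex(pattern, ignore_case = TRUE))
    ),
    1, 0
  ))
```

Adding another column to the check then only means adding its name inside c(...).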