I want to create a dummy variable that is 1 if the value contains any of the codes below. For some reason str_detect is not working. The error message is as follows:
Error in type(pattern) : argument "pattern" is missing, with no default
sam_data_rd$high_int <- as.integer(str_detect(sam_data_rd$assertions.primarynaics,
c("2111", "3254", "3341", "3342", "3344","3345", "3364", "5112", "5171", "51331",
"5179", "5133Z", "5182", "5191", "5142", "5141Z", "5191Z","5191ZM", "5413", "5415", "5417")))
CodePudding user response:
The pattern can be a single string with the alternatives joined by OR (|). Note that the pattern argument is vectorized to allow multiple elements, but its length then has to match the length of the string (or the column), i.e. it will be an elementwise comparison.
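To make the difference concrete, here is a small sketch of the two behaviours. The vector x is made up for illustration and is not the asker's data.

```r
library(stringr)

# Hypothetical example values, not taken from sam_data_rd
x <- c("2111", "9999", "5415")

# One pattern string with alternatives: applied to every element of x
any_match <- str_detect(x, "2111|5415")
any_match   # TRUE FALSE TRUE

# Pattern vector of the same length as x: element i of x is only
# compared with element i of the pattern
pairwise <- str_detect(x, c("2111", "5415", "9999"))
pairwise    # TRUE FALSE FALSE
```

The second call is why passing the whole vector of codes directly does not answer "does this value contain any of the codes".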
library(stringr)
v1 <- c("2111", "3254", "3341", "3342", "3344","3345", "3364", "5112", "5171", "51331", "5179", "5133Z", "5182", "5191", "5142", "5141Z", "5191Z","5191ZM", "5413", "5415", "5417")
pat <- str_c("\\b(", str_c(v1, collapse = "|"), ")\\b")
sam_data_rd$high_int <-
as.integer(str_detect(sam_data_rd$assertions.primarynaics, pat))
Another option is to loop over each of the elements and then reduce the results to a single logical vector:
library(purrr)
library(dplyr)
sam_data_rd <- sam_data_rd %>%
mutate(high_int = map(v1,
~ str_detect(assertions.primarynaics, .x)) %>%
reduce(`|`) %>%
as.integer)
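If you would rather avoid the extra packages, the same loop-and-reduce idea can be sketched in base R. The names x and codes below are hypothetical stand-ins for the column and the code list. Note that fixed = TRUE matches the codes as literal substrings rather than as regular expressions, which is an assumption about what is wanted here.

```r
# Hypothetical stand-ins for the real column and code list
x <- c("211120", "541511", "999999")
codes <- c("2111", "5415")

# One logical vector per code: does each element of x contain that code?
hits <- lapply(codes, function(p) grepl(p, x, fixed = TRUE))

# Combine the logical vectors elementwise with OR, then convert to 0/1
high_int <- as.integer(Reduce(`|`, hits))
high_int   # 1 1 0
```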
CodePudding user response:
Try this:
library(dplyr)
library(stringr)
pattern <- paste(c("2111", "3254", "3341", "3342", "3344","3345", "3364", "5112", "5171", "51331",
"5179", "5133Z", "5182", "5191", "5142", "5141Z", "5191Z","5191ZM", "5413", "5415", "5417"), collapse = "|")
sam_data_rd <- sam_data_rd %>%
mutate(high_int = ifelse(str_detect(assertions.primarynaics, pattern), 1, 0))