dummy variable in R for partial string

Time:07-24

I want to create a dummy variable that is 1 if the value contains any of a set of number codes. For some reason str_detect is not working. The error is as follows:

Error in type(pattern) : argument "pattern" is missing, with no default

sam_data_rd$high_int <- as.integer(str_detect(sam_data_rd$assertions.primarynaics,
                                              c("2111", "3254", "3341", "3342", "3344","3345", "3364", "5112", "5171", "51331", 
                                    "5179", "5133Z", "5182", "5191", "5142", "5141Z", "5191Z","5191ZM", "5413", "5415", "5417")))

CodePudding user response:

The pattern can be a single string in which the alternatives are joined with OR (|). Note that str_detect is also vectorized over the pattern, but a vector of patterns is compared elementwise against the string, so its length should be 1 or match the length of the string (i.e. the column).

library(stringr)
v1 <- c("2111", "3254", "3341", "3342", "3344", "3345", "3364", "5112", "5171", "51331",
        "5179", "5133Z", "5182", "5191", "5142", "5141Z", "5191Z", "5191ZM", "5413", "5415", "5417")
pat <- str_c("\\b(", str_c(v1, collapse = "|"), ")\\b")

sam_data_rd$high_int <- 
     as.integer(str_detect(sam_data_rd$assertions.primarynaics, pat))
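To make the difference concrete, here is a minimal sketch on toy data. The vectors `codes` and `v1_small` are made up for illustration and are not from the original data frame. It compares an elementwise pattern vector with a single alternation pattern, and shows that the `\\b` boundaries match only whole codes; if "contains a part" is meant as a prefix match, anchoring with `^` instead is one possible variation.

```r
library(stringr)

# Hypothetical toy values, for illustration only
codes    <- c("2111", "541511", "9999")
v1_small <- c("9999", "2111", "5415")

# Elementwise: codes[i] is tested against v1_small[i] only,
# so nothing matches even though "2111" and "9999" are both in v1_small
elementwise <- str_detect(codes, v1_small)

# Single alternation pattern with word boundaries: whole-code matches only,
# so "541511" is not flagged by "5415"
pat_whole <- str_c("\\b(", str_c(v1_small, collapse = "|"), ")\\b")
whole <- as.integer(str_detect(codes, pat_whole))

# Variation (an assumption about the intent): flag codes that start with an entry
pat_prefix <- str_c("^(", str_c(v1_small, collapse = "|"), ")")
prefix <- as.integer(str_detect(codes, pat_prefix))
```

Here `whole` flags "2111" and "9999" but not "541511", while `prefix` flags all three. Which behaviour is wanted depends on how the codes in the column are stored.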

Or another option is to loop over each of the elements and then reduce it to a single logical vector

library(purrr)
library(dplyr)
sam_data_rd <- sam_data_rd %>%
  mutate(high_int = map(v1, ~ str_detect(assertions.primarynaics, .x)) %>%
           reduce(`|`) %>%
           as.integer())
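The same loop-and-reduce idea can be sketched in base R without any packages. This swaps in `grepl`, `lapply` and `Reduce`, and is not the code from the answer above. The vectors `codes` and `v1_small` are again hypothetical toy data, and `fixed = TRUE` treats each entry as a literal substring rather than a regular expression.

```r
# Hypothetical toy values, for illustration only
codes    <- c("2111", "541511", "9999", "5191ZM")
v1_small <- c("2111", "5415", "5191ZM")

# One logical vector per entry: does each code contain that entry?
hits <- lapply(v1_small, function(p) grepl(p, codes, fixed = TRUE))

# Combine the logical vectors with OR, then convert to a 0/1 dummy
high_int <- as.integer(Reduce(`|`, hits))
```

Because this is plain substring matching, "541511" is flagged by "5415" here, unlike with the word-boundary pattern.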

CodePudding user response:

Try this:

library(dplyr)
library(stringr)

pattern <- paste(c("2111", "3254", "3341", "3342", "3344","3345", "3364", "5112", "5171", "51331", 
                    "5179", "5133Z", "5182", "5191", "5142", "5141Z", "5191Z","5191ZM", "5413", "5415", "5417"), collapse = "|")
sam_data_rd %>% 
  # 1 if any of the codes is found, 0 otherwise
  mutate(high_int = ifelse(str_detect(assertions.primarynaics, pattern), 1, 0))