How can I use str_detect() in a function to create new columns?

Time:04-09

I have a dataset with a column of freeform text, e.g.:

dat <- tribble(
  ~id, ~freeform_text,
  1, "some words to detect from",
  2, "some more words to detect"
)

I want to create a function that takes a specified dataframe, searches a column for a given string and returns a new column indicating whether the string was detected.

I would like to do this using tidyverse syntax.

Here is what I have tried so far...

create_text_feature <- function(data, column, string) {
  data %>% 
    mutate("{{string}}_detected" := ifelse(str_detect({{column}}, string), 1, 0))
}

Ideally, I would then run create_text_feature(dat, freeform_text, more) and I would end up with the following dataset.

dat <- tribble(
  ~id, ~freeform_text, ~more_detected,
  1, "some words to detect from", 0,
  2, "some more words to detect", 1
)

I would be even more grateful if this could be created to take a list of strings and create multiple new columns in the same way.

CodePudding user response:

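You can achieve this by passing the pattern as a quoted string and using single curly braces in the name on the left-hand side of the assignment. Single braces are glue syntax that interpolates the value of string into the column name, whereas double braces embrace the argument expression itself: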

library(tidyverse)

dat <- tribble(
  ~id, ~freeform_text,
  1, "some words to detect from",
  2, "some more words to detect"
)

create_text_feature <- function(data, column, string) {
  data %>% 
    mutate("{string}_detected" := ifelse(str_detect({{column}}, string), 1, 0))
}


create_text_feature(dat, freeform_text, "more")
#> # A tibble: 2 x 3
#>      id freeform_text             more_detected
#>   <dbl> <chr>                             <dbl>
#> 1     1 some words to detect from             0
#> 2     2 some more words to detect             1
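
For the second part of the question (several strings, one new column each), one possible extension is to loop over the patterns and add a column per pattern. This is only a sketch under a few assumptions: the name create_text_features and its strings argument are made up for illustration, each string is treated as a regular expression (as str_detect() does by default), and each string is assumed to be usable as part of a column name.

```r
library(dplyr)
library(stringr)
library(tibble)

dat <- tribble(
  ~id, ~freeform_text,
  1, "some words to detect from",
  2, "some more words to detect"
)

# Hypothetical helper (name and `strings` argument are illustrative only).
# `strings` can be a character vector or a list of single strings.
create_text_features <- function(data, column, strings) {
  for (s in strings) {
    # "{s}" is glue syntax: the current pattern becomes part of the column name.
    # {{ column }} forwards the unquoted column name supplied by the caller.
    data <- data %>%
      mutate("{s}_detected" := ifelse(str_detect({{ column }}, s), 1, 0))
  }
  data
}

result <- create_text_features(dat, freeform_text, c("more", "from"))
result
```

With the example data this should add more_detected and from_detected after the existing columns. A plain for loop is used because it keeps the glue name and the pattern visibly tied to the same value; purrr::reduce() would be an alternative if you prefer a functional style.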