I have a dataset with a column of freeform text e.g.
dat <- tribble(
  ~id, ~freeform_text,
  1, "some words to detect from",
  2, "some more words to detect"
)
I want to create a function that takes a specified dataframe, searches a column for a given string and returns a new column indicating whether the string was detected.
I would like to do this using tidyverse syntax.
Here is what I have tried so far...
create_text_feature <- function(data, column, string) {
  data %>%
    mutate("{{string}}_detected" := ifelse(str_detect({{column}}, string), 1, 0))
}
Ideally, I would then run create_text_feature(dat, freeform_text, more)
and I would end up with the following dataset.
dat <- tribble(
  ~id, ~freeform_text, ~more_detected,
  1, "some words to detect from", 0,
  2, "some more words to detect", 1
)
I would be even more grateful if the function could also take a list of strings and create one new column per string in the same way.
CodePudding user response:
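You could achieve your result by passing the pattern as a quoted string and using single curly braces in the name on the left-hand side of the assignment. Inside the name string, {string} inserts the value of the variable, while {{column}} is still needed to pass the unquoted column name through to mutate():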
library(tidyverse)
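dat <- tribble(
  ~id, ~freeform_text,
  1, "some words to detect from",
  2, "some more words to detect"
)
create_text_feature <- function(data, column, string) {
  data %>%
    # "{string}" interpolates the value of `string` into the new column name;
    # {{column}} forwards the unquoted column name to str_detect()
    mutate("{string}_detected" := ifelse(str_detect({{column}}, string), 1, 0))
}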
create_text_feature(dat, freeform_text, "more")
#> # A tibble: 2 x 3
#>      id freeform_text             more_detected
#>   <dbl> <chr>                             <dbl>
#> 1     1 some words to detect from             0
#> 2     2 some more words to detect             1
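For the last part of the question (several strings at once), one option is to loop over a character vector and add one column per string. The following is only a minimal sketch under a few assumptions: create_text_features and its strings argument are names made up for this example, each string is treated as a regular expression (as in the function above), and .env$s is used so that the pattern is taken from the function environment even if the data happens to have a column called s.

```r
library(dplyr)
library(stringr)
library(tibble)

dat <- tribble(
  ~id, ~freeform_text,
  1, "some words to detect from",
  2, "some more words to detect"
)

# Hypothetical helper: adds one "<string>_detected" column per element of `strings`.
create_text_features <- function(data, column, strings) {
  for (s in strings) {
    data <- data %>%
      # "{s}" puts the current string into the column name;
      # .env$s makes sure the pattern comes from the loop variable, not from a column
      mutate("{s}_detected" := as.numeric(str_detect({{ column }}, .env$s)))
  }
  data
}

result <- create_text_features(dat, freeform_text, c("more", "from"))
```

With the example data this should add a more_detected and a from_detected column of 0/1 values. If the strings contain regex metacharacters and should be matched literally, wrapping the pattern in stringr's fixed() is probably what you want.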