Given these two R data frames, one holding a set of free-form texts and the other a set of arbitrary keywords:
library(dplyr) # needed for rename()
df <- as.data.frame(c(
"I have a social media account on Twitter",
"I love cheese recipes on Facebook",
"I love cheese recipes on Pinterest",
"I am a social media marketer on Instagram who loves social media",
"I love posting cheese recipes on social media",
"Conspiracy theories are logical fallacies"
)) |>
rename(phrase = 1)
keyword_df <- as.data.frame(c(
"social media",
"cheese recipe",
"tinfoil hat"
))
What's the easiest tidyverse way to create this outcome?
phrase | social_media | cheese_recipe | tinfoil_hat |
---|---|---|---|
I have a social media account on Twitter | 1 | 0 | 0 |
I love cheese recipes on Facebook | 0 | 1 | 0 |
I love cheese recipes on Pinterest | 0 | 1 | 0 |
I am a social media marketer on Instagram who loves social media | 2 | 0 | 0 |
I love posting cheese recipes on social media | 1 | 1 | 0 |
Conspiracy theories are logical fallacies | 0 | 0 | 0 |
Answer:
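library(dplyr)

# For each keyword, count its matches in every phrase;
# as.data.frame() turns the spaces in the names into dots
df %>%
  mutate(as.data.frame(lapply(
    setNames(nm = keyword_df[[1]]),
    function(z) lengths(stringr::str_extract_all(phrase, z))
  )))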
# phrase social.media cheese.recipe tinfoil.hat
# 1 I have a social media account on Twitter 1 0 0
# 2 I love cheese recipes on Facebook 0 1 0
# 3 I love cheese recipes on Pinterest 0 1 0
# 4 I am a social media marketer on Instagram who loves social media 2 0 0
# 5 I love posting cheese recipes on social media 1 1 0
# 6 Conspiracy theories are logical fallacies 0 0 0
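Note that the column names above come out as `social.media` rather than the `social_media` asked for in the question. A minimal sketch of a variant that produces underscore names, assuming dplyr, purrr and stringr are installed and R is 4.1 or newer. Matching with `fixed()` is my assumption that keywords should be treated literally rather than as regular expressions, and `keywords`, `counts` and `result` are hypothetical names used only for illustration:

```r
library(dplyr)
library(purrr)
library(stringr)

df <- data.frame(phrase = c(
  "I have a social media account on Twitter",
  "I love cheese recipes on Facebook",
  "I love cheese recipes on Pinterest",
  "I am a social media marketer on Instagram who loves social media",
  "I love posting cheese recipes on social media",
  "Conspiracy theories are logical fallacies"
))
keywords <- c("social media", "cheese recipe", "tinfoil hat")

# Build a named list: one integer vector of match counts per keyword,
# with spaces in the keyword replaced by underscores for the column name
counts <- keywords |>
  set_names(str_replace_all(keywords, " ", "_")) |>
  map(\(k) str_count(df$phrase, fixed(k)))

# Attach the count columns to the original phrases
result <- bind_cols(df, as_tibble(counts))
```

`str_count()` returns the number of matches directly, so the `lengths(str_extract_all(...))` step is not needed here. Drop `fixed()` if the keywords are meant to be regular expressions.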