Let's say I have this data frame of several random sentences:
Sentences<-c("John is playing a video game at the moment","Tom will cook a delicious meal later",
"Kyle is with his friends watching the game",
"Diana is hosting her birthday party tomorrow night"
)
df<-data.frame(Sentences)
keywords<-c("game","is","will","meal","birthday","party")
I also have a vector of keywords. I need to create a new column in the data frame that contains only the keywords that appear in each sentence.
na.omit(str_match(df[n,],keywords))
I have constructed this line of code, which returns the keywords used in a given sentence (n stands for the row number). How do I automate this code so that it is applied to each row?
CodePudding user response:
We could use str_extract_all from the stringr package for this:
library(dplyr)
library(stringr)
df %>%
mutate(new_col = str_extract_all(Sentences, paste(keywords, collapse = "|")))
                                           Sentences             new_col
1         John is playing a video game at the moment            is, game
2               Tom will cook a delicious meal later          will, meal
3         Kyle is with his friends watching the game        is, is, game
4 Diana is hosting her birthday party tomorrow night is, birthday, party
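Note that row 3 lists "is" twice, because the pattern also matches the "is" inside "his", and that new_col is a list column. If whole-word matches and a plain character column are preferred, one possible variant is sketched below. It is not part of the answer above: the names pattern and result are only illustrative, and it assumes the keywords contain no regex metacharacters.

```r
library(dplyr)
library(stringr)

df <- data.frame(Sentences = c(
  "John is playing a video game at the moment",
  "Tom will cook a delicious meal later",
  "Kyle is with his friends watching the game",
  "Diana is hosting her birthday party tomorrow night"
), stringsAsFactors = FALSE)
keywords <- c("game", "is", "will", "meal", "birthday", "party")

# Wrap the alternatives in \\b word boundaries so that "is" only matches
# as a whole word and no longer matches inside "his".
# Assumption: keywords are plain words without regex metacharacters.
pattern <- paste0("\\b(", paste(keywords, collapse = "|"), ")\\b")

result <- df %>%
  mutate(new_col = vapply(
    str_extract_all(Sentences, pattern),
    # drop repeated keywords and join the rest into one string per row
    function(x) paste(unique(x), collapse = ", "),
    character(1)
  ))

print(result)
```

Here vapply turns the list returned by str_extract_all into an ordinary character vector, which tends to be easier to print or export than a list column.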