I need to check whether any variation of a word appears in a text. How can I do that without typing every variation out? For example, if I search for the word 'broken', is there a way in R to match that word and its other variations?
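library(stringr)  # provides str_detect()
a = "Broken flask"
b = "fragmented flask"
c = "broke glass"
d = "shattered glass"
e = "break flask"
text = c(a, b, c, d, e)
# every variation currently has to be listed by hand
str_detect(tolower(text), "broken|fragmented|broke|break|shatter|shattered")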
CodePudding user response:
You could check out syn() from the syn package, which generates synonyms for a given word, allowing you to do:
library(syn)
grepl(paste0(c("broken", syn("broken")), collapse = "|"), text, ignore.case = TRUE)
#> [1] TRUE TRUE TRUE TRUE FALSE
It picked up 4 out of 5 here, missing only "break flask", without having to program any variations.
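The fifth element ("break flask") came back FALSE above, presumably because "break" is an inflection of "broken" rather than a synonym of it. One way to cover such cases is to add a few hand-picked forms to the pattern. Below is a minimal base-R sketch, not a definitive solution: the helper name detect_any and the variants vector are made up for illustration, and the word-boundary anchors are a design choice that assumes you want whole-word matches only.

```r
# Hypothetical helper: TRUE where any of the given variants occurs
# as a whole word in `text` (case-insensitive).
detect_any <- function(text, variants) {
  # \\b anchors keep "break" from matching inside e.g. "unbreakable"
  pattern <- paste0("\\b(", paste(variants, collapse = "|"), ")\\b")
  grepl(pattern, text, ignore.case = TRUE, perl = TRUE)
}

text <- c("Broken flask", "fragmented flask", "broke glass",
          "shattered glass", "break flask")

# Hand-picked forms (an assumption for this example, not generated output)
variants <- c("broken", "broke", "break", "fragmented", "shattered")

detect_any(text, variants)
```

If the syn package is installed, the same helper could take c(variants, syn::syn("broken")) to combine the hand-picked inflections with the generated synonyms, assuming the returned synonyms contain no regex metacharacters.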