Fast way to search for a series of keywords among articles

Time:11-20

To illustrate with an example:

I have a few keywords (case sensitive).

 kw <- c("American Express", "Inc said")

I have quite a few articles.

 library(tm) # assumption: acq is the sample Reuters corpus that ships with the tm package
 data("acq")
 dv <- sapply(seq_along(acq), function(x) acq[[x]]$content) # flatten the corpus so that dv is just a vector of strings

I want a logical table as output, with one row per article and one column per keyword. This is how I build it at the moment:

temp <- sapply(seq_along(kw), function(x) stringr::str_detect(dv, kw[x]))
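
To make the shape of that table concrete without the acq data, here is a minimal sketch on three made-up strings. The strings are invented for illustration, and base R's grepl with fixed = TRUE is swapped in for stringr::str_detect so that it runs without extra packages.

```r
# Made-up stand-ins for the articles and keywords (not the acq data).
dv <- c("American Express said profits rose",
        "Acme Inc said it will buy a rival",
        "No keyword appears here")
kw <- c("American Express", "Inc said")

# One logical column per keyword, one row per article.
# fixed = TRUE matches each keyword as a literal, case-sensitive string.
temp <- sapply(kw, function(k) grepl(k, dv, fixed = TRUE))
print(temp)
```

Each row of temp corresponds to an article and each column to a keyword, which is the layout the sapply call above produces on the real data.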

The problem is, I have millions of records and the method that I am using is not efficient enough.

CodePudding user response:

What about parallelizing? This is an example based on your code:

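library(parallel)

n_cores <- 2 # number of cores for parallel processing
cl <- makeCluster(n_cores)
# split the articles in dv across the workers; kw is passed along as an extra argument
temp <- parSapply(cl, dv, function(doc, kw) stringr::str_detect(doc, kw), kw = kw, USE.NAMES = FALSE)
stopCluster(cl)
temp <- t(temp) # one row per article, one column per keyword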
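
Because the keywords are literal, case-sensitive strings, it may also help to skip the regex engine and to hand each worker one large chunk of articles rather than one article at a time. The following is only a sketch under those assumptions: the toy dv and kw and the helper detect_chunk are hypothetical names introduced for illustration, and base R's grepl with fixed = TRUE stands in for str_detect.

```r
library(parallel)

# Toy stand-ins for the question's dv and kw (invented for illustration).
dv <- c("American Express said profits rose",
        "Acme Inc said it will buy a rival",
        "No keyword appears here")
kw <- c("American Express", "Inc said")

# Hypothetical helper: match every keyword against one chunk of texts.
# fixed = TRUE treats each keyword as a literal string, not a regex.
detect_chunk <- function(texts, kw) {
  hits <- lapply(kw, function(k) grepl(k, texts, fixed = TRUE))
  matrix(unlist(hits), nrow = length(texts), ncol = length(kw),
         dimnames = list(NULL, kw))
}

n_cores <- 2
cl <- makeCluster(n_cores)

# One contiguous chunk of articles per worker keeps communication overhead low.
chunks <- lapply(splitIndices(length(dv), n_cores), function(i) dv[i])
parts <- parLapply(cl, chunks, detect_chunk, kw = kw)
stopCluster(cl)

# Stack the per-chunk matrices back into one table, in the original order.
temp <- do.call(rbind, parts)
```

Whether this beats the plain sapply version depends on the data size and the number of cores, so it is worth timing both on a sample before running it over millions of records.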