Create many columns and set values based on previous columns with lapply

Time:09-09

I have a table with descriptions of symptoms, like below:

library(data.table)

DT <- data.table(no = c(1, 2, 3),
                 symptom = c("headache and numbness", "tachycardia, sometimes headahce", "breath difficulty with limb numbness"))

The keywords I'm focusing on look like this:

key.word <- list(
  head = c("head", "headache"),
  chest = c("breath", "tachycardia", "palpitaion")
)

I want to add two columns that indicate whether any keyword of each group is mentioned in the symptom column. The desired result looks like this:

    result <- data.table(no = c(1, 2, 3),
                 symptom = c("headache and numbness", "tachycardia, sometimes headahce", "breath difficulty with limb numbness"),
                 head = c(T, T, F),
                 chest = c(F, T, T))

I can do this with:

# key.word$head and key.word$chest hold the keyword vectors defined above
DT[, head := symptom %like% paste0(key.word$head, collapse = "|")]
DT[, chest := symptom %like% paste0(key.word$chest, collapse = "|")]

But I'm wondering whether there is a way to do this with lapply and data.table, which seems more elegant. Thanks in advance!

CodePudding user response:

Here is a data.table option

DT[ , lapply(
    key.word, function(x) any(sapply(x, function(w) grepl(w, symptom)))), 
    by = list(no, symptom)]
#   no                              symptom  head chest
#1:  1                headache and numbness  TRUE FALSE
#2:  2      tachycardia, sometimes headahce  TRUE  TRUE
#3:  3 breath difficulty with limb numbness FALSE  TRUE

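The inner sapply loop is necessary because the pattern argument of grepl is not vectorised: given several patterns, it uses only the first one. Grouping by no and symptom makes each row its own group, so any() is evaluated once per row.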
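
As an alternative sketch (not part of the answer above): grepl is vectorised over its x argument, so each keyword group can be collapsed into a single alternation pattern such as "head|headache". That avoids both the inner sapply and the by grouping. The version below assumes the keywords contain no regex metacharacters, and it adds the columns to DT by reference with := rather than returning a new table.

```r
library(data.table)

DT <- data.table(
  no = c(1, 2, 3),
  symptom = c("headache and numbness",
              "tachycardia, sometimes headahce",
              "breath difficulty with limb numbness")
)

key.word <- list(
  head  = c("head", "headache"),
  chest = c("breath", "tachycardia", "palpitaion")
)

# Collapse each keyword group into one regex (e.g. "head|headache"),
# then let grepl() test every row of symptom in a single call.
# The new column names are taken from names(key.word).
DT[, (names(key.word)) := lapply(
  key.word,
  function(x) grepl(paste(x, collapse = "|"), symptom)
)]
```

Because the column names come from names(key.word), adding another keyword group to the list should produce another column without changing the call.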

CodePudding user response:

Not sure whether you are looking only for a data.table or lapply option, but here is one using dplyr and stringr:

library(dplyr)
library(stringr)

DT %>% mutate(head  = str_detect(symptom, str_c(key.word[[1]], collapse = '|')),
              chest = str_detect(symptom, str_c(key.word[[2]], collapse = '|')))
#   no                              symptom  head chest
#1:  1                headache and numbness  TRUE FALSE
#2:  2      tachycardia, sometimes headahce  TRUE  TRUE
#3:  3 breath difficulty with limb numbness FALSE  TRUE
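
If the number of keyword groups grows, writing one str_detect line per group gets repetitive. A possible generalisation, offered as a sketch rather than as the answerer's method, is to build all columns with lapply and pass the resulting data frame to mutate, which adds each of its columns to the output. The name result is only illustrative, and the keywords are again assumed to be free of regex metacharacters.

```r
library(data.table)
library(dplyr)
library(stringr)

DT <- data.table(
  no = c(1, 2, 3),
  symptom = c("headache and numbness",
              "tachycardia, sometimes headahce",
              "breath difficulty with limb numbness")
)

key.word <- list(
  head  = c("head", "headache"),
  chest = c("breath", "tachycardia", "palpitaion")
)

# lapply() returns a named list of logical vectors (one per keyword group);
# as.data.frame() turns it into a data frame, and mutate() adds every column
# of an unnamed data-frame argument to the result.
result <- DT %>%
  mutate(as.data.frame(lapply(
    key.word,
    function(x) str_detect(symptom, str_c(x, collapse = "|"))
  )))
```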