Generate an ID based on one condition

Time:03-30

I have 100 PDF medical reports from different people. I loaded each report into a list in R; each element has two columns holding a lot of different information, but I only want the reports that involve gallbladder tissue. So I want to create an ID for the whole report, not only for the rows that contain the word "gallbladder". Then I want to filter only the gallbladder reports to extract further information. This is how each element of the list looks (they have much more information):

list[[1]]
report text text_2
1 name andres
1 tissue gallbladder
1 rut 11455698

list[[2]]
report text text_2
2 name ana
2 tissue liver
2 rut 5556678

I want to create the ID for the whole report according to whether the tissue is gallbladder:

list[[1]]
report text text_2 ID
1 name andres 1
1 tissue gallbladder 1
1 rut 11455698 1

list[[2]]
report text text_2 ID
2 name ana 0
2 tissue liver 0
2 rut 5556678 0

Then I want to keep only the reports where ID == 1.

I tried many ways, but I only get the ID on the matching row, not on the whole report:

list[[1]]
report text text_2 ID
1 name andres 0
1 tissue gallbladder 1
1 rut 11455698 0

list[[2]]
report text text_2 ID
2 name ana 0
2 tissue liver 0
2 rut 5556678 0

Maybe you have some ideas! Thank you!

CodePudding user response:

We can loop over the list with lapply and create the ID column by checking whether any value in the 'text_2' column is "gallbladder". any() returns a single TRUE/FALSE, which gets recycled across all rows of that list element, and this logical value can be coerced to binary (1/0) with as.integer or a unary +:

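# Flag every row of a report with 1 if any row mentions "gallbladder", else 0
list2 <- lapply(list, function(x)
     transform(x, ID = as.integer(any(text_2 == "gallbladder", na.rm = TRUE))))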
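
The question also asks how to keep only the gallbladder reports afterwards. Here is a minimal sketch of the whole flow on made-up data that mirrors the two reports in the question. The names reports, row_flag, flagged and gallbladder_reports are only illustrative, and Filter() from base R is one possible way to do the filtering step, not necessarily the only or best one.

```r
# Hypothetical sample data mirroring the two reports in the question
reports <- list(
  data.frame(report = 1,
             text   = c("name", "tissue", "rut"),
             text_2 = c("andres", "gallbladder", "11455698")),
  data.frame(report = 2,
             text   = c("name", "tissue", "rut"),
             text_2 = c("ana", "liver", "5556678"))
)

# Row-wise comparison flags only the matching row
# (this is the result the question is trying to avoid)
row_flag <- as.integer(reports[[1]]$text_2 == "gallbladder")

# any() collapses the comparison to one value per report,
# which transform() recycles over every row of that report
flagged <- lapply(reports, function(x)
  transform(x, ID = as.integer(any(text_2 == "gallbladder", na.rm = TRUE))))

# Keep only the list elements whose ID is 1
gallbladder_reports <- Filter(function(x) all(x$ID == 1L), flagged)
```

With this sample data, gallbladder_reports should contain only the first report, and further information can then be extracted from each remaining element with another lapply.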