Excluding rows based on certain conditions in corresponding columns

Time:06-29

I'm trying to exclude rows from a data frame based on two criteria:

Criterion A - exclude a row if imm1 or imm2 equals bio1 or bio2 (this would exclude row 1)
Criterion B - exclude a row if imm1 or imm2 equals surg (this would exclude row 2)

meta_CD (name of data frame)

Row no  imm1  imm2  bio1  bio2  surg
1       2009  2010  2010  NA    NA
2       2004  NA    2015  NA    2004
3       2009  2009  NA    NA    NA
4       2015  NA    NA    NA    NA

Just wondered how I would do this in R please.

Thanks in advance.

CodePudding user response:

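Quick answer: to exclude rows where imm1 is in a fixed set of values (here the literal strings 'bio1' and 'bio2'), you can use:

meta_CD <- meta_CD[!(meta_CD$imm1 %in% c('bio1','bio2')), ]

To also check imm2:

meta_CD <- meta_CD[!(meta_CD$imm1 %in% c('bio1','bio2') | meta_CD$imm2 %in% c('bio1','bio2')), ]

Note that c('bio1','bio2') is a vector of strings, not the bio1 and bio2 columns, so this does not compare the values row by row as the question asks. There may be a better solution.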
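Since the question asks for a row-by-row comparison between columns, a minimal base R sketch could look like the following. It rebuilds the data frame from the table in the question; the helper name same is hypothetical and introduced only for illustration. It treats a comparison involving NA as "no match", which is an assumption about the intended behaviour.

```r
# Sample data rebuilt from the question's table
meta_CD <- data.frame(
  Row.no = 1:4,
  imm1 = c(2009, 2004, 2009, 2015),
  imm2 = c(2010, NA, 2009, NA),
  bio1 = c(2010, 2015, NA, NA),
  bio2 = rep(NA_real_, 4),
  surg = c(NA, 2004, NA, NA)
)

# Hypothetical helper: TRUE only where both values are present and equal
same <- function(a, b) !is.na(a) & !is.na(b) & a == b

# Criteria A: imm1 or imm2 equals bio1 or bio2 in the same row
crit_A <- with(meta_CD, same(imm1, bio1) | same(imm1, bio2) |
                        same(imm2, bio1) | same(imm2, bio2))

# Criteria B: imm1 or imm2 equals surg in the same row
crit_B <- with(meta_CD, same(imm1, surg) | same(imm2, surg))

# Keep only the rows that meet neither criterion
result <- meta_CD[!(crit_A | crit_B), ]
print(result)
```

With the sample data, rows 1 and 2 are dropped and rows 3 and 4 are kept.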

CodePudding user response:

Try filtering in dplyr:

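library(dplyr)
meta_CD %>%
  filter(!coalesce(  # exclude rows where any comparison below is TRUE
    # Criteria A:
    imm1 == bio1 | imm1 == bio2 | imm2 == bio1 | imm2 == bio2 |
    # Criteria B:
    imm1 == surg | imm2 == surg,
    FALSE  # an all-NA comparison counts as "no match", so rows 3 and 4 are kept
  ))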
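One thing to watch with == on these columns is that they contain NA. A comparison with NA yields NA rather than FALSE, and filter() keeps only rows where the condition is TRUE, so a row whose condition evaluates to NA is dropped. The short base R sketch below illustrates the behaviour; the helper name safe is hypothetical.

```r
# A comparison involving NA gives NA, not FALSE
cmp <- c(2009, 2015) == c(NA, 2015)

# In an OR, TRUE wins over NA, but FALSE | NA stays NA
or_res <- c(TRUE | NA, FALSE | NA)

# Negating NA still gives NA, so "!condition" does not turn it into TRUE
neg <- !NA

# Hypothetical helper: treat NA as "condition not met"
safe <- function(cond) !is.na(cond) & cond

print(cmp)
print(or_res)
print(neg)
print(safe(c(TRUE, FALSE, NA)))
```

Wrapping a condition in a helper like safe (or in dplyr's coalesce(cond, FALSE)) before negating it is one way to keep rows where nothing could be compared.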

CodePudding user response:

%in% can be applied row by row by calling it inside mapply, after dropping the NA values on each side:

A <- mapply(function(a, b) any(a[!is.na(a)] %in% b[!is.na(b)]),
            asplit(x[c("imm1", "imm2")], 1), asplit(x[c("bio1", "bio2")], 1))

B <- mapply(function(a, b) any(a[!is.na(a)] %in% b[!is.na(b)]),
            asplit(x[c("imm1", "imm2")], 1), asplit(x[c("surg")], 1))

x[!(A | B),]
#  Row.no imm1 imm2 bio1 bio2 surg
#3      3 2009 2009   NA   NA   NA
#4      4 2015   NA   NA   NA   NA

Data

x <- read.table(header=TRUE, text="Row.no   imm1    imm2    bio1    bio2    surg
1   2009    2010    2010    NA  NA
2   2004    NA  2015    NA  2004
3   2009    2009    NA  NA  NA
4   2015    NA  NA  NA  NA")
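If the same check is needed for other groups of columns, the pairwise comparison can be wrapped in a small function. The sketch below is one possible generalisation, not part of the answer above: the function name any_match and its arguments are hypothetical, and it assumes that a comparison involving NA should count as "no match".

```r
x <- read.table(header = TRUE, text = "Row.no imm1 imm2 bio1 bio2 surg
1 2009 2010 2010 NA NA
2 2004 NA 2015 NA 2004
3 2009 2009 NA NA NA
4 2015 NA NA NA NA")

# Hypothetical helper: for each row, is any column in `left`
# equal to any column in `right`? NA comparisons count as FALSE.
any_match <- function(df, left, right) {
  hit <- rep(FALSE, nrow(df))
  for (l in left) {
    for (r in right) {
      eq <- df[[l]] == df[[r]]
      hit <- hit | (!is.na(eq) & eq)
    }
  }
  hit
}

A <- any_match(x, c("imm1", "imm2"), c("bio1", "bio2"))  # Criteria A
B <- any_match(x, c("imm1", "imm2"), "surg")             # Criteria B

kept <- x[!(A | B), ]
print(kept)
```

Each column pair is compared as a whole vector, so the loops run over column names rather than over rows.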