I'm trying to exclude certain rows from a data frame based on two criteria:
Criteria A - exclude if imm1 or imm2 = bio1 or bio2 (i.e. would exclude row 1)
Criteria B - exclude if imm1 or imm2 = surg (i.e. would exclude row 2)
meta_CD (name of data frame)
Row no | imm1 | imm2 | bio1 | bio2 | surg |
---|---|---|---|---|---|
1 | 2009 | 2010 | 2010 | NA | NA |
2 | 2004 | NA | 2015 | NA | 2004 |
3 | 2009 | 2009 | NA | NA | NA |
4 | 2015 | NA | NA | NA | NA |
Just wondered how I would do this in R please.
Thanks in advance.
CodePudding user response:
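Quick answer: %in% tests against a fixed set of values, so it cannot compare each row against its own bio1 and bio2. Compare the columns directly with == instead. Wrapping each comparison in %in% TRUE turns the NA results into FALSE. To exclude rows where imm1 equals bio1 or bio2, you can use:
eq <- function(a, b) (a == b) %in% TRUE  # TRUE only where both values are present and equal
meta_CD <- meta_CD[!(eq(meta_CD$imm1, meta_CD$bio1) | eq(meta_CD$imm1, meta_CD$bio2)), ]
To add imm2 (use this instead of the line above, not after it):
meta_CD <- meta_CD[!(eq(meta_CD$imm1, meta_CD$bio1) | eq(meta_CD$imm1, meta_CD$bio2) | eq(meta_CD$imm2, meta_CD$bio1) | eq(meta_CD$imm2, meta_CD$bio2)), ]
There may be a better solution.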
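For anyone who wants to check a row-wise comparison against the sample data, here is a minimal base R sketch covering both criteria. The helper name any_match is hypothetical (it is not from any package), and the data frame is retyped from the question with "Row no" written as Row.no.

```r
# Sample data from the question
meta_CD <- data.frame(
  Row.no = 1:4,
  imm1 = c(2009, 2004, 2009, 2015),
  imm2 = c(2010, NA, 2009, NA),
  bio1 = c(2010, 2015, NA, NA),
  bio2 = c(NA_real_, NA, NA, NA),
  surg = c(NA, 2004, NA, NA)
)

# Hypothetical helper: TRUE for each row where any `left` column equals
# any `right` column in that same row. Comparisons involving NA count
# as "no match" because `%in% TRUE` maps NA to FALSE.
any_match <- function(df, left, right) {
  pairs <- expand.grid(l = left, r = right, stringsAsFactors = FALSE)
  hits <- Map(function(l, r) (df[[l]] == df[[r]]) %in% TRUE, pairs$l, pairs$r)
  Reduce(`|`, hits)
}

crit_A <- any_match(meta_CD, c("imm1", "imm2"), c("bio1", "bio2"))
crit_B <- any_match(meta_CD, c("imm1", "imm2"), "surg")

# Keep only the rows that meet neither criterion
kept <- meta_CD[!(crit_A | crit_B), ]
print(kept)
```

On the sample data this keeps rows 3 and 4. Listing the column names as arguments makes it easy to add further imm or bio columns later.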
CodePudding user response:
Try filtering with dplyr:
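library(dplyr)
meta_CD %>%
  filter(!coalesce( # exclude; coalesce(..., FALSE) treats an NA result as "no match",
                    # otherwise filter() would also drop rows whose comparisons are all NA
    # Criteria A:
    imm1 == bio1 | imm1 == bio2 | imm2 == bio1 | imm2 == bio2
    | # or:
    # Criteria B:
    imm1 == surg | imm2 == surg,
    FALSE))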
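Equality tests on columns that contain NA need some care, because R uses three-valued logic: a comparison with NA is NA, and negating NA is still NA. dplyr's filter() keeps only rows where the condition is TRUE, so rows that evaluate to NA are dropped. A small base R sketch of that behaviour follows; the variable names are only for illustration.

```r
# A comparison against NA returns NA, not FALSE
cmp_na <- NA == 2009          # NA

# `|` resolves to a definite value only when one side is TRUE
na_or_true  <- NA | TRUE      # TRUE
na_or_false <- NA | FALSE     # NA

# Negating NA keeps it NA, so "not (unknown match)" is still unknown
not_na <- !NA                 # NA

# Logical subsetting with an NA index yields an NA element
x <- c(10, 20, 30)
keep <- c(TRUE, NA, FALSE)
x[keep]                       # 10 NA

# Two common ways to treat NA as FALSE before subsetting or filtering
safe1 <- keep %in% TRUE       # TRUE FALSE FALSE
safe2 <- !is.na(keep) & keep  # TRUE FALSE FALSE
x[safe1]                      # 10
```

Either form can wrap a combined condition before it is negated, so that rows with missing bio or surg values are kept rather than silently dropped.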
CodePudding user response:
%in% can be used when called inside mapply.
A <- mapply(function(a, b) any(a[!is.na(a)] %in% b[!is.na(b)]),
            asplit(x[c("imm1", "imm2")], 1), asplit(x[c("bio1", "bio2")], 1))
B <- mapply(function(a, b) any(a[!is.na(a)] %in% b[!is.na(b)]),
            asplit(x[c("imm1", "imm2")], 1), asplit(x[c("surg")], 1))
x[!(A | B),]
# Row.no imm1 imm2 bio1 bio2 surg
#3 3 2009 2009 NA NA NA
#4 4 2015 NA NA NA NA
Data
x <- read.table(header=TRUE, text="Row.no imm1 imm2 bio1 bio2 surg
1 2009 2010 2010 NA NA
2 2004 NA 2015 NA 2004
3 2009 2009 NA NA NA
4 2015 NA NA NA NA")