I have a simplified dataframe as follows:
test <- data.frame(
  x = c(1, 2, 3, NA, NA, NA),
  y = c(NA, NA, NA, 3, 2, NA),
  a = c(NA, NA, NA, NA, NA, TRUE)
)
I want to test whether, whenever there's an NA in column x, there is always a numeric value in column y, AND whenever there's a numeric value in column x, there is always an NA in column y. How can I do that?
Thanks!
CodePudding user response:
You can do:
(is.na(test$x) & is.numeric(test$y) & !(is.na(test$y))) |
(is.na(test$y) & is.numeric(test$x) & !(is.na(test$x)))
#[1] TRUE TRUE TRUE TRUE TRUE FALSE
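If a single yes/no answer is needed rather than one value per row, the row-wise result can be collapsed with all(). A minimal sketch, assuming that "numeric value" simply means "not NA" in these numeric columns; the name ok is introduced here only for illustration:

```r
test <- data.frame(
  x = c(1, 2, 3, NA, NA, NA),
  y = c(NA, NA, NA, 3, 2, NA),
  a = c(NA, NA, NA, NA, NA, TRUE)
)

# TRUE for a row when exactly one of x and y is NA
ok <- xor(is.na(test$x), is.na(test$y))

# Single TRUE/FALSE: does the rule hold for every row?
all(ok)

# Row numbers that break the rule (here, the row where both are NA)
which(!ok)
```

For the sample data, all(ok) is FALSE because row 6 has NA in both columns.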
CodePudding user response:
table(is.na(test[is.na(test$x),]$y))
Among the rows where x is NA, this counts how many have an NA in column y (TRUE) and how many do not (FALSE). Note that it only checks one of the two directions.
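The same idea might be extended to a two-way table that covers both directions at once. This is only a sketch: the dimension names x_na and y_na and the variables tab and valid are made up for illustration, and factor() with explicit levels is used so that both FALSE and TRUE appear even when one of them never occurs.

```r
test <- data.frame(
  x = c(1, 2, 3, NA, NA, NA),
  y = c(NA, NA, NA, 3, 2, NA),
  a = c(NA, NA, NA, NA, NA, TRUE)
)

# Cross-tabulate "is x NA?" against "is y NA?"
tab <- table(
  x_na = factor(is.na(test$x), levels = c(FALSE, TRUE)),
  y_na = factor(is.na(test$y), levels = c(FALSE, TRUE))
)
tab

# The rule holds only if no row has both values present or both values NA
valid <- tab["FALSE", "FALSE"] == 0 && tab["TRUE", "TRUE"] == 0
valid
```

With the sample data the cell for x_na = TRUE, y_na = TRUE contains 1 (row 6), so valid is FALSE.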
CodePudding user response:
We could use
Reduce(`+`, lapply(test[c('x', 'y')], \(x) is.numeric(x) & is.na(x))) == 1
[1] TRUE TRUE TRUE TRUE TRUE FALSE
CodePudding user response:
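This gives a single TRUE/FALSE for each of the two conditions:
library(dplyr)
# Rows where x is NA must all have a non-NA y
is_valid_1 <- test %>%
  filter(is.na(x)) %>%
  summarise(ok = all(!is.na(y))) %>%
  pull(ok)
# Rows where x is not NA must all have an NA y
is_valid_2 <- test %>%
  filter(!is.na(x)) %>%
  summarise(ok = all(is.na(y))) %>%
  pull(ok)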