Test structure of dataframe in R

Time:12-24

I have a simplified dataframe as follows:

test <- data.frame(
        x = c(1,2,3,NA,NA,NA),
        y = c(NA, NA, NA, 3, 2, NA),
        a = c(NA, NA, NA, NA, NA, TRUE)
        )

I want to test that whenever column x is NA, column y always has a numeric value, AND that whenever column x has a numeric value, column y is always NA. How can I do that?

Thanks!

CodePudding user response:

You can do:

(is.na(test$x) & is.numeric(test$y) & !(is.na(test$y))) |
  (is.na(test$y) & is.numeric(test$x) & !(is.na(test$x)))
#[1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
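
A shorter variant of the same check, as a sketch: since both columns are numeric, "has a numeric value" is the same as "is not NA", so the rule reduces to "exactly one of x and y is NA in each row". Base R's xor() expresses that directly. The names row_ok, all_ok and bad_rows are just illustrative.

```r
# Same sample data as in the question
test <- data.frame(
  x = c(1, 2, 3, NA, NA, NA),
  y = c(NA, NA, NA, 3, 2, NA),
  a = c(NA, NA, NA, NA, NA, TRUE)
)

# TRUE for rows where exactly one of x / y is NA
row_ok <- xor(is.na(test$x), is.na(test$y))

# Single TRUE/FALSE for the whole data frame
all_ok <- all(row_ok)

# Row numbers that break the rule (row 6 here, where both are NA)
bad_rows <- which(!row_ok)
```

all_ok gives one answer for the whole data frame, and bad_rows shows where to look if it is FALSE.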

CodePudding user response:

table(is.na(test[is.na(test$x),]$y))

This counts, among the rows where x is NA, how many values in column y are NA (TRUE) and how many are not (FALSE).
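
To cover both directions at once, one option is to cross-tabulate the NA status of the two columns. This is only a sketch; the names x_na, y_na, tab and is_valid are made up for illustration. The rule holds when the two cells where both columns agree (both NA, or both non-NA) are zero.

```r
# Same sample data as in the question
test <- data.frame(
  x = c(1, 2, 3, NA, NA, NA),
  y = c(NA, NA, NA, 3, 2, NA),
  a = c(NA, NA, NA, NA, NA, TRUE)
)

# Fix the levels so the table is always 2 x 2, even if one case never occurs
x_na <- factor(is.na(test$x), levels = c(FALSE, TRUE))
y_na <- factor(is.na(test$y), levels = c(FALSE, TRUE))
tab <- table(x_is_na = x_na, y_is_na = y_na)

# Rows violating the rule: both NA, or both non-NA
both_na      <- tab["TRUE", "TRUE"]
both_present <- tab["FALSE", "FALSE"]

is_valid <- both_na == 0 && both_present == 0
```

With the sample data there is one row where both columns are NA, so is_valid is FALSE.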

CodePudding user response:

We could use

Reduce(`+`, lapply(test[c('x', 'y')], \(x) is.numeric(x) & is.na(x))) == 1
[1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE

CodePudding user response:

This will give you a single TRUE/FALSE for each of the two conditions:

library(dplyr)

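# TRUE if every row with NA in x has a non-NA value in y
is_valid_1 <- test %>%
  filter(is.na(x)) %>%
  summarise(ok = all(!is.na(y))) %>%
  pull(ok)

# TRUE if every row with a value in x has NA in y
is_valid_2 <- test %>%
  filter(!is.na(x)) %>%
  summarise(ok = all(is.na(y))) %>%
  pull(ok)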