User Defined R function for comparing vectors

Time:03-26

I am trying to write an R function called both_na(x, y) that takes two vectors (x and y) of the same length and returns the number of indices at which both x and y are NA. If x and y have different lengths, the function should print “vectors are not of equal length” and return 0.

I cannot get the last part to work, where the message about unequal lengths is printed.

both_na <- function(x, y) {
  sum(is.na(x) & is.na(y))
}

What type of if and else statement should I be using?

CodePudding user response:

This function does what you need.

both_na <- function(x, y){
  if(length(x) != length(y)) {
    print('Vectors are not of equal length')
    return(0)
  }
  
  # count the positions where both vectors are NA
  sum(is.na(x) & is.na(y))
}

x <- c(1, NA, 2, 3, NA, 5)
y <- c(1, NA, NA, 3, NA, 5)
z <- c(1, NA, NA, 3, NA)

both_na(x, y)
#> [1] 2
both_na(x, z)
#> [1] "Vectors are not of equal length"
#> [1] 0

Created on 2022-03-25 by the reprex package (v2.0.1)
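
Since the question asks specifically about if and else, here is a minimal sketch of the same length check written as a single if/else expression rather than an early return. The name both_na_ifelse is made up for illustration; the function returns the count of shared NA positions via sum().

```r
# Hypothetical variant: the same check as one if/else expression.
both_na_ifelse <- function(x, y) {
  if (length(x) == length(y)) {
    # equal lengths: count positions where both vectors are NA
    sum(is.na(x) & is.na(y))
  } else {
    # unequal lengths: print the message, then return 0
    print("vectors are not of equal length")
    0
  }
}

both_na_ifelse(c(1, NA, 2, 3, NA, 5), c(1, NA, NA, 3, NA, 5))
both_na_ifelse(c(1, NA, 2), c(NA, NA))
```

Because an R function returns the value of its last evaluated expression, neither branch needs an explicit return().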

CodePudding user response:

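You could generalize this to an arbitrary number of vectors. The vectors passed in are collected into a list, which makes it easy to compare their lengths. Only if the variance of the lengths is zero (i.e. all lengths are equal) do we cbind the vectors and count NAs by row; otherwise we issue a warning and return 0. Note that the row test is hard-coded as == 2, so with more than two vectors it counts the rows that contain exactly two NAs.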

f <- \(...) {
  lst <- list(...)
  stopifnot(length(lst) > 1L)
  ls <- lengths(lst)
  if (sum(ls) == 0L) {  ## all lengths null
    0L
  }
  else if (var(ls) != 0) {  ## different lengths
    warning("vectors are not of equal length")
    0L
  } else { 
    sum(rowSums(is.na(do.call(cbind, lst))) == 2)
  }
}

Usage

f(x, y)
# [1] 2
f(x, y[1:3])
# [1] 0
# Warning message:
#  In f(x, y[1:3]) : vectors are not of equal length
f(x, y, z)
# [1] 7
f(x, y[1:3], z)
# [1] 0
# Warning message:
# In f(x, y[1:3], z) : vectors are not of equal length
f(x, y, NULL)
# [1] 0
# Warning message:
# In f(x, y, NULL) : vectors are not of equal length
f(NULL, NULL)
# [1] 0
f(x)
f()
f(NULL)
# Error in f(NULL) : length(lst) > 1L is not TRUE

Data:

x <- c(1L, 1L, 1L, 1L, NA, NA, NA, NA, 1L, NA, 1L, NA, 1L, NA, 1L, 
1L, NA)

y <- c(NA, NA, NA, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L, 1L, NA, NA, 
NA, NA)

z <- c(1L, NA, 1L, NA, NA, NA, 1L, 1L, NA, NA, NA, NA, NA, NA, NA, 
1L, NA)
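
If the goal with three or more vectors is to count the positions that are NA in every vector, one possible adjustment is to compare the row sums against the number of vectors instead of the constant 2. The sketch below is an assumption about that intent, not part of the answer above: the name all_na and the small test vectors a, b and d are made up for illustration, and length(unique(...)) is swapped in for the var() check.

```r
# Hypothetical variant of f(): counts positions where *every* vector is NA.
all_na <- function(...) {
  lst <- list(...)
  stopifnot(length(lst) > 1L)
  ls <- lengths(lst)
  if (length(unique(ls)) != 1L) {  # lengths differ
    warning("vectors are not of equal length")
    return(0L)
  }
  if (ls[1] == 0L) {  # all inputs are empty
    return(0L)
  }
  # a row counts only if all of its entries are NA
  sum(rowSums(is.na(do.call(cbind, lst))) == length(lst))
}

# Illustrative data (not the data from the answer)
a <- c(NA, 1, NA, NA)
b <- c(NA, NA, 2, NA)
d <- c(NA, NA, NA, 4)

all_na(a, b)     # positions 1 and 4 are NA in both
all_na(a, b, d)  # only position 1 is NA in all three
```

For two vectors this behaves like the == 2 version; the two differ only once a third vector is added.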