How to find the vectorized difference between two lists?

Time:09-11

Consider

x <- c(1,2,3)
y <- c(1,0,3)
x == y
[1]  TRUE FALSE  TRUE

I would like to achieve the same result using lists:

x <- list(c(11,11) ,c(22,22), c(33,33) )
y <- list(c(11,11) ,c(21,22), c(33,33) )
x == y

But instead of [1] TRUE FALSE TRUE this gives the error message:

Error in x == y : comparison of these types is not implemented

How can I compare two lists element by element in a vectorized way, so that the result is one TRUE/FALSE per list element?

Performance matters, because the lists are large.

CodePudding user response:

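We could use mapply (or Map, which returns a list instead of a vector) to loop over the corresponding elements. Note that setequal compares the elements as sets, so it ignores order and duplicates: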

mapply(setequal, x, y)
[1]  TRUE FALSE  TRUE
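
Because setequal works on sets, it can report TRUE for pairs that differ position by position. The following is a small sketch of that difference; the lists `a` and `b` are made-up illustration data, not part of the question.

```r
# Illustration data (hypothetical): same values, different order or multiplicity.
a <- list(c(11, 22), c(1, 1, 2))
b <- list(c(22, 11), c(1, 2, 2))

# setequal() ignores order and duplicates within each pair.
set_result <- mapply(setequal, a, b)

# A position-by-position check: same length and every position equal.
elem_result <- mapply(function(u, v) length(u) == length(v) && all(u == v), a, b)

print(set_result)
print(elem_result)
```

If the order of values inside each element matters, the position-by-position form is probably the safer choice.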

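Or with == and all (the \(u, v) lambda shorthand requires R 4.1 or later; na.rm = TRUE means positions that compare to NA are ignored):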

mapply(\(u, v) all(u == v, na.rm = TRUE), x, y)
[1]  TRUE FALSE  TRUE
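
If a stricter notion of equality is wanted, one possible alternative is identical(), which also requires matching type and length and does not skip NA values. This is a sketch of that variant, not the method from the answer; it uses function(u, v) style so it also runs on R versions before 4.1.

```r
x <- list(c(11, 11), c(22, 22), c(33, 33))
y <- list(c(11, 11), c(21, 22), c(33, 33))

# vapply() guarantees a logical vector with one value per list element.
strict <- vapply(seq_along(x),
                 function(i) identical(x[[i]], y[[i]]),
                 logical(1))
print(strict)

# With na.rm = TRUE the NA position is dropped, so the pair looks equal;
# identical() treats the two vectors as different.
na_loose  <- all(c(1, NA) == c(1, 2), na.rm = TRUE)
na_strict <- identical(c(1, NA), c(1, 2))
print(c(loose = na_loose, strict = na_strict))
```

Keep in mind that identical() distinguishes integer from double, so identical(1L, 1) is FALSE.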

For a vectorized option (assuming every element has the same length, here 2), unlist both lists, compare them with ==, reshape the result into a logical matrix with one row per list element, and check that each row sum equals the element length:

rowSums(matrix(unlist(x) == unlist(y), nrow = length(x), byrow = TRUE)) == 2
[1]  TRUE FALSE  TRUE

Or a variation of the above that first binds the elements into matrices with rbind:

rowSums(do.call(rbind, x) == do.call(rbind, y)) == 2
[1]  TRUE FALSE  TRUE
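
Both vectorized versions hard-code the element length 2. A minimal sketch of a helper that derives the length from the data is shown below; the name `list_elements_equal` is hypothetical, and the helper assumes all elements of both lists are non-empty atomic vectors of one common length.

```r
# Hypothetical helper: element-wise equality for two lists whose elements
# all share the same (non-zero) length k.
list_elements_equal <- function(a, b) {
  stopifnot(length(a) == length(b))
  k <- unique(c(lengths(a), lengths(b)))
  stopifnot(length(k) == 1L)  # every element must have the same length
  # One row per list element, one column per position inside the element.
  eq <- matrix(unlist(a) == unlist(b), ncol = k, byrow = TRUE)
  rowSums(eq) == k
}

x <- list(c(11, 11), c(22, 22), c(33, 33))
y <- list(c(11, 11), c(21, 22), c(33, 33))
res2 <- list_elements_equal(x, y)

# The same helper with elements of length 3.
res3 <- list_elements_equal(list(1:3, 4:6), list(1:3, c(4L, 0L, 6L)))

print(res2)
print(res3)
```

Unlike the all(..., na.rm = TRUE) version, a row that contains an NA comparison yields NA here, because rowSums does not drop missing values by default.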

Benchmarks (each list repeated 1e6 times, giving 3e6 elements)

> x1 <- rep(x, 1e6)
> y1 <- rep(y, 1e6)
> system.time(mapply(setequal, x1, y1))
   user  system elapsed 
 14.450   0.125  15.013 
> system.time(mapply(\(u, v) all(u == v, na.rm = TRUE), x1, y1))
   user  system elapsed 
  3.535   0.020   3.550 
> system.time(rowSums(matrix(unlist(x1) == unlist(y1),
    nrow = length(x1), byrow = TRUE)) == 2)
   user  system elapsed 
  0.268   0.038   0.307 
> system.time(rowSums(do.call(rbind, x1) == do.call(rbind, y1)) == 2)
   user  system elapsed 
  5.176   0.321   5.524 
> system.time({
   out <- logical(length(x1))
   for(i in seq_along(x1)) {
    out[i] <- all(x1[[i]] == y1[[i]], na.rm = TRUE)
    }
 })
   user  system elapsed 
  1.189   0.055   1.242 
  # baseline: == on plain atomic vectors of length 1e6
> system.time( seq(1e6) == seq(1e6))
   user  system elapsed 
  0.004   0.001   0.006 
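
Before comparing timings, it may be worth confirming that the approaches agree on the same input. The sketch below does that on a smaller repetition count (1e4, an arbitrary choice to keep it quick) and then times two of them; actual timings depend on the machine and R version, so none are shown.

```r
x <- list(c(11, 11), c(22, 22), c(33, 33))
y <- list(c(11, 11), c(21, 22), c(33, 33))
x1 <- rep(x, 1e4)
y1 <- rep(y, 1e4)

r_mapply <- mapply(function(u, v) all(u == v, na.rm = TRUE), x1, y1)
r_matrix <- rowSums(matrix(unlist(x1) == unlist(y1),
                           nrow = length(x1), byrow = TRUE)) == 2
r_rbind  <- rowSums(do.call(rbind, x1) == do.call(rbind, y1)) == 2

r_loop <- logical(length(x1))
for (i in seq_along(x1)) {
  r_loop[i] <- all(x1[[i]] == y1[[i]], na.rm = TRUE)
}

# All four approaches should produce exactly the same logical vector.
stopifnot(identical(r_mapply, r_matrix),
          identical(r_matrix, r_rbind),
          identical(r_rbind, r_loop))

# Timings vary by machine; compare the relative order, not the absolute values.
t_mapply <- system.time(mapply(function(u, v) all(u == v, na.rm = TRUE), x1, y1))
t_matrix <- system.time(rowSums(matrix(unlist(x1) == unlist(y1),
                                       nrow = length(x1), byrow = TRUE)) == 2)
print(t_mapply)
print(t_matrix)
```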
Tags: r