Consider
x <- c(1,2,3)
y <- c(1,0,3)
x == y
[1] TRUE FALSE TRUE
I would like to achieve the same result using lists:
x <- list(c(11, 11), c(22, 22), c(33, 33))
y <- list(c(11, 11), c(21, 22), c(33, 33))
x == y
But instead of
[1] TRUE FALSE TRUE
this gives the error message:
Error in x == y : comparison of these types is not implemented
How can I compare two lists element by element in a vectorized way?
Performance matters.
Answer:
We could use Map/mapply to loop over the corresponding elements:
mapply(setequal, x, y)
[1] TRUE FALSE TRUE
Or with == and all:
mapply(\(u, v) all(u == v, na.rm = TRUE), x, y)
[1] TRUE FALSE TRUE
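The two mapply variants above are not interchangeable in general: setequal compares the sets of values, so it ignores order and duplicates, while all(u == v) compares position by position. A small sketch of the difference (the vectors a and b are made up for illustration; identical is shown as a further base R alternative, not something the answer itself uses):

```r
# setequal() treats its arguments as sets: order and duplicates are ignored.
a <- c(11, 22)
b <- c(22, 11)
set_res   <- setequal(a, b)   # same set of values
pos_res   <- all(a == b)      # position-by-position comparison
ident_res <- identical(a, b)  # exact object equality

# On the lists from the question, the positional and exact checks agree.
x <- list(c(11, 11), c(22, 22), c(33, 33))
y <- list(c(11, 11), c(21, 22), c(33, 33))
res_all   <- mapply(function(u, v) all(u == v, na.rm = TRUE), x, y)
res_ident <- mapply(identical, x, y)
```

Here set_res is TRUE while pos_res and ident_res are FALSE, so setequal is only appropriate when order within each element does not matter.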
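For a vectorized option (assuming every list element has the same length), unlist the elements, compare them with ==, convert the result to a matrix with one row per list element, and check with rowSums whether every entry in a row is TRUE (the row sum equals the element length, 2 here):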
rowSums(matrix(unlist(x) == unlist(y), nrow = length(x), byrow = TRUE)) == 2
[1] TRUE FALSE TRUE
Or a variation of the above that builds the matrices with do.call(rbind, ...):
rowSums(do.call(rbind, x) == do.call(rbind, y)) == 2
[1] TRUE FALSE TRUE
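The vectorized variants above hard-code the element length as 2. A minimal sketch of a wrapper that derives it from the data instead, assuming both lists have the same number of elements and every element has the same length (list_equal is a hypothetical helper name, not part of base R or of the original answer):

```r
# Hypothetical helper: element-wise equality of two lists whose elements
# all share one common length k.
list_equal <- function(x, y) {
  stopifnot(length(x) == length(y))
  k <- unique(c(lengths(x), lengths(y)))
  stopifnot(length(k) == 1L)  # every element must have the same length
  # One row per list element, one column per position within the element.
  eq <- matrix(unlist(x) == unlist(y), nrow = length(x), byrow = TRUE)
  rowSums(eq) == k            # TRUE only if all k positions match
}

x <- list(c(11, 11), c(22, 22), c(33, 33))
y <- list(c(11, 11), c(21, 22), c(33, 33))
res2 <- list_equal(x, y)

# Same helper on elements of length 3.
res3 <- list_equal(list(1:3, 4:6), list(1:3, c(4L, 5L, 7L)))
```

Unlike the mapply version with na.rm = TRUE, this sketch does not drop NA values: a row containing NA yields NA rather than TRUE or FALSE.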
Benchmarks
> x1 <- rep(x, 1e6)
> y1 <- rep(y, 1e6)
> system.time(mapply(setequal, x1, y1))
user system elapsed
14.450 0.125 15.013
> system.time(mapply(\(u, v) all(u == v, na.rm = TRUE), x1, y1))
user system elapsed
3.535 0.020 3.550
> system.time(rowSums(matrix(unlist(x1) == unlist(y1),
      nrow = length(x1), byrow = TRUE)) == 2)
user system elapsed
0.268 0.038 0.307
> system.time(rowSums(do.call(rbind, x1) == do.call(rbind, y1)) == 2)
user system elapsed
5.176 0.321 5.524
> system.time({
    out <- logical(length(x1))
    for (i in seq_along(x1)) {
      out[i] <- all(x1[[i]] == y1[[i]], na.rm = TRUE)
    }
  })
user system elapsed
1.189 0.055 1.242
# base comparison with vectors
> system.time( seq(1e6) == seq(1e6))
user system elapsed
0.004 0.001 0.006
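The explicit for loop in the benchmarks can also be written with vapply, which preallocates the result and checks that each call returns a single logical value. This is only a sketch of the equivalent form on the small lists from the question; it was not part of the timings above, so no claim is made about its speed:

```r
x <- list(c(11, 11), c(22, 22), c(33, 33))
y <- list(c(11, 11), c(21, 22), c(33, 33))

# Same logic as the for loop: compare the i-th elements position by position.
# logical(1) makes vapply fail loudly if a comparison returns anything else.
out <- vapply(
  seq_along(x),
  function(i) all(x[[i]] == y[[i]], na.rm = TRUE),
  logical(1)
)
```

To compare it against the other variants, it can be wrapped in system.time() on x1 and y1 in the same way as the calls above.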