Comparisons of more than two lists in R

Time:12-07

For my problem, I have 4 lists, each containing 10,000 elements. Let the lists be a, b, c, d. I can calculate Probability(a<b) with mean(a<b). If I understand correctly, this compares the 10,000 elements of a and b pairwise, in order, and returns the fraction of positions where a<b holds (the count divided by the total number of elements).

Now, I want to compute Probability(a<b<c<d). I want R to compare the 10,000 elements in order and tell me for what fraction of them a<b<c<d holds. However, I can't do this with mean(a<b<c<d), because R doesn't accept more than one < sign in a single comparison. How can I use the mean function here? I'm an absolute beginner in R, but logically, I feel this should be straightforward, without looping over everything and keeping a count variable.

CodePudding user response:

How about this:

a <- runif(10000)
b <- runif(10000)
c <- runif(10000)
d <- runif(10000)
mean(a<b & b<c & c<d)
#> [1] 0.0425

Created on 2022-12-06 by the reprex package (v2.0.1)
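
As a sanity check on that estimate: when the four vectors are independent draws from the same continuous distribution, all 4! = 24 orderings are equally likely, so the exact value of P(a<b<c<d) is 1/24, roughly 0.0417. A minimal sketch comparing the simulated fraction with that value (the seed and the larger sample size are arbitrary choices, not part of the original answer):

```r
set.seed(1)        # arbitrary seed, assumed here only for reproducibility
n <- 1e5           # larger than the 10,000 in the question, to reduce noise

a <- runif(n)
b <- runif(n)
c <- runif(n)
d <- runif(n)

est   <- mean(a < b & b < c & c < d)  # simulated fraction of positions
exact <- 1 / factorial(4)             # 1/24 under the i.i.d. assumption

print(est)
print(exact)
```

The two numbers should agree to about two decimal places. This only applies when the vectors really are independent and identically distributed; for real data there is no such reference value.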

CodePudding user response:

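You need to use logical operators, which are & (and) and | (or). These two are the element-wise (vectorized) versions: they compare vectors position by position and return a logical vector of the same length. Their counterparts && and || are meant for single TRUE/FALSE values. An example: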

set.seed(7*11*13)
n <- 100
# draw n distinct integers from 1..1000 for each vector
a <- sample(1:1000, n)
b <- sample(1:1000, n)
c <- sample(1:1000, n)
d <- sample(1:1000, n)

mean( (a<b)&(b<c)&(c<d) )
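
To make the difference between & and && concrete, here is a small sketch on three hand-written vectors (x, y and z are made up for illustration). The element-wise & yields one TRUE/FALSE per position, which is what mean() needs, while && is only suitable for length-one conditions:

```r
x <- c(1, 5, 3)
y <- c(2, 4, 6)
z <- c(3, 6, 9)

# Element-wise: one result per position
elementwise <- x < y & y < z
print(elementwise)        # TRUE FALSE TRUE
print(mean(elementwise))  # fraction of positions where x < y < z

# Scalar form: appropriate only for single values, e.g. the first position
first_only <- x[1] < y[1] && y[1] < z[1]
print(first_only)         # TRUE
```

Passing whole vectors to && is an error in recent versions of R, so & is the operator to use inside mean().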

CodePudding user response:

Using Rfast::coldiffs.

mean(rowSums(Rfast::coldiffs(matrix(c(A, B, C, D), length(A), 4)) > 0) == 3)
#> [1] 0.0397

Or multiplication

mean((A<B)*(B<C)*(C<D))
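
The multiplication form works because R coerces TRUE to 1 and FALSE to 0 in arithmetic, so a product of logical vectors is 1 only where every factor is TRUE. A minimal sketch on two made-up logical vectors (p and q are illustrative names):

```r
p <- c(TRUE, TRUE, FALSE)
q <- c(TRUE, FALSE, FALSE)

prod_pq <- p * q   # numeric: 1 0 0
and_pq  <- p & q   # logical: TRUE FALSE FALSE

print(prod_pq)
print(and_pq)

# Both give the same fraction once averaged
print(mean(prod_pq))
print(mean(and_pq))
```

The intermediate results differ in type (numeric versus logical), but mean() returns the same value for both.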

For whatever reason, multiplying/adding logical vectors tends to be faster than using &/| in nearly every case I've encountered:

microbenchmark::microbenchmark(
  coldiffs = mean(rowSums(Rfast::coldiffs(matrix(c(A, B, C, D), length(A), 4)) > 0) == 3),
  logical = mean(A<B & B<C & C<D),
  multiplication = mean((A<B)*(B<C)*(C<D)),
  check = "identical"
)
#> Unit: microseconds
#>            expr     min       lq     mean  median       uq      max neval
#>        coldiffs 253.701 384.6515 454.6581 402.551 429.6010 6795.401   100
#>         logical 131.201 160.5505 200.4389 166.051 173.1515 3510.901   100
#>  multiplication 100.601 122.5010 126.3041 126.001 132.2010  241.400   100
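
All three expressions above hard-code four vectors. If the number of vectors varies, one possible generalization is to keep them in a list and combine the pairwise comparisons with Map and Reduce. This is only a sketch: prob_increasing is a hypothetical helper name, not something from the answers, and it has not been benchmarked against the variants above.

```r
# Hypothetical helper: fraction of positions where the vectors in the
# list are strictly increasing, v1 < v2 < ... < vk.
prob_increasing <- function(vectors) {
  stopifnot(length(vectors) >= 2)
  # Compare each vector with its successor: v1 < v2, v2 < v3, ...
  pairwise <- Map(`<`, head(vectors, -1), tail(vectors, -1))
  # Combine all comparisons element-wise with &
  mean(Reduce(`&`, pairwise))
}

set.seed(42)  # arbitrary seed, assumed only for reproducibility
vs <- replicate(4, runif(1e4), simplify = FALSE)

res_general <- prob_increasing(vs)
res_manual  <- mean(vs[[1]] < vs[[2]] & vs[[2]] < vs[[3]] & vs[[3]] < vs[[4]])

print(res_general)
print(res_manual)
```

For four vectors this reduces to the same computation as mean(A<B & B<C & C<D).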

Data

A <- runif(1e4)
B <- runif(1e4)
C <- runif(1e4)
D <- runif(1e4)