Home > front end >  Multidimensional STAT in sweep function of R
Multidimensional STAT in sweep function of R

Time:08-18

How can I sweep the first two dimensions of an array based on the first columns of third dimension?

A simplified example:

My array:

a <- array(1:24,c(4,3,2))
> a
, , 1

     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12

, , 2

     [,1] [,2] [,3]
[1,]   13   17   21
[2,]   14   18   22
[3,]   15   19   23
[4,]   16   20   24

I would like to divide each matrix by its first column by sweep (or another) function. I would like to put the following output in a single array by using a single sweep function.

> sweep(a[,,1], 1, a[,,1][,1], "/")
     [,1]     [,2]     [,3]
[1,]    1 5.000000 9.000000
[2,]    1 3.000000 5.000000
[3,]    1 2.333333 3.666667
[4,]    1 2.000000 3.000000

> sweep(a[,,2], 1, a[,,2][,1], "/")
     [,1]     [,2]     [,3]
[1,]    1 1.307692 1.615385
[2,]    1 1.285714 1.571429
[3,]    1 1.266667 1.533333
[4,]    1 1.250000 1.500000

How can I achieve this?

CodePudding user response:

Using a simple for loop.

res <- array(dim=dim(a))
for (i in seq_len(dim(a)[3])) res[, , i] <- a[, , i]/a[, 1, i]

res
# , , 1
# 
#      [,1]     [,2]     [,3]
# [1,]    1 5.000000 9.000000
# [2,]    1 3.000000 5.000000
# [3,]    1 2.333333 3.666667
# [4,]    1 2.000000 3.000000
# 
# , , 2
# 
#      [,1]     [,2]     [,3]
# [1,]    1 1.307692 1.615385
# [2,]    1 1.285714 1.571429
# [3,]    1 1.266667 1.533333
# [4,]    1 1.250000 1.500000

This appears to outperform *apply functions.

a <- array(rnorm(1e3), c(4, 3, 20000))
microbenchmark::microbenchmark(sapply=array(sapply(seq_len(dim(a)[3]), \(x) a[, , x] / a[, 1, x]), dim=dim(a)),
                               vapply=array(vapply(seq_len(dim(a)[3]), \(x) a[, , x] / a[, 1, x], vector('numeric', prod(dim(a)[1:2]))), dim=dim(a)),`for`={res <- array(dim=dim(a))
                               for (i in seq_len(dim(a)[3])) res[, , i] <- a[, , i] / a[, 1, i]},
                               apply=apply(a, 3, function(x) x / x[,1]))
# Unit: milliseconds
#   expr       min        lq      mean    median        uq      max neval cld
# sapply  73.22494  78.46897  88.27141  86.18902  90.08711 232.8461   100  b 
# vapply  72.03503  78.24985  86.89068  84.87590  90.58196 226.0209   100  b 
#    for  58.87842  64.48287  74.33136  70.23162  77.25822 209.9818   100 a  
#  apply 110.48710 118.79229 130.51282 124.47029 134.64933 294.0801   100   c
  • Related