I have a matrix of data, and I want to check whether the absolute value of every entry in each column falls within a column-specific bound. I would then like to calculate the proportion of rows for which this holds in all columns at once. I know how to do this manually, but I would like to write it generally, without a loop, so that it works whenever the user gives me a matrix X and a vector y of any size. The only additional piece of information is that the number of columns of X will always equal the length of y. I would also like to do this in base R if possible. Here is my R code:
set.seed(42)
# Made up data
x <- matrix(rnorm(27), nrow = 9)
y <- c(.2, .5, 2)
> sum(abs(x[,1]) <= y[1] & abs(x[,2]) <= y[2] & abs(x[,3]) <= y[3]) / nrow(x)
[1] 0.2222222
So ideally I would want something like
sum(abs(x) <= y) / nrow(x)
Answer:
sum(rowSums(t(t(abs(x)) <= y)) == ncol(x)) / nrow(x)
# [1] 0.2222222
Walk-through:
Unfortunately, x > y recycles y across x, but column-wise, so it is effectively doing c(x[1,1] > y[1], x[2,1] > y[2], x[3,1] > y[3], x[4,1] > y[1], ...), which is not what we want. We can transpose x with t() so that we get the correct recycling of y ... and then transpose it again to get it back in the same shape as x (not strictly required).
t(t(abs(x)) <= y)
#       [,1]  [,2]  [,3]
# [1,] FALSE  TRUE FALSE
# [2,] FALSE FALSE  TRUE
# [3,] FALSE FALSE  TRUE
# [4,] FALSE FALSE  TRUE
# [5,] FALSE  TRUE  TRUE
# [6,]  TRUE  TRUE  TRUE
# [7,] FALSE FALSE  TRUE
# [8,]  TRUE  TRUE  TRUE
# [9,] FALSE FALSE  TRUE
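To see the column-wise recycling in isolation, here is a small sketch on made-up data. The names m and v are illustrative only and are not part of the question; the values are chosen so that every entry is within its own column's bound.

```r
# Hypothetical toy data: 2 rows, 3 columns, one threshold per column
m <- matrix(c(1, 1, 5, 5, 9, 9), nrow = 2)
v <- c(2, 6, 10)

# Direct comparison recycles v down the columns of m (column-major order),
# so m[1,1] meets v[1], m[2,1] meets v[2], m[1,2] meets v[3], and so on
m <= v

# Transposing first pairs each column of m with its own threshold;
# the outer t() only restores the original shape
t(t(m) <= v)
```

The first comparison contains some FALSE entries even though each column is within its bound, while the transposed version is TRUE everywhere.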
Now we want to know which rows have as many TRUEs as x has columns, done with rowSums(.) == ncol(x). The number of such rows is then counted with sum(.), and dividing by nrow(x) turns that count into a proportion.
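If this needs to be reused for arbitrary inputs, the steps could be wrapped in a function. The following is only a sketch: the name prop_within is hypothetical, and it swaps the double transpose for base R's sweep(), which applies the comparison along the column margin, and uses mean() on the logical vector in place of sum(.) / nrow(x).

```r
# Hypothetical helper: proportion of rows where every |x[i, j]| <= y[j]
prop_within <- function(x, y) {
  # Assumes x is a matrix with one threshold in y per column
  stopifnot(is.matrix(x), length(y) == ncol(x))
  # Compare each column j of abs(x) against y[j]
  within <- sweep(abs(x), 2, y, FUN = "<=")
  # A row counts only if all of its columns are within range;
  # the mean of a logical vector is the proportion of TRUEs
  mean(rowSums(within) == ncol(x))
}

# Same made-up data as in the question
set.seed(42)
x <- matrix(rnorm(27), nrow = 9)
y <- c(.2, .5, 2)
prop_within(x, y)
```

On the question's data this should agree with the manual column-by-column expression.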