I have the dataset below in R and I want to calculate Shannon's entropy. Since the data are continuous, I first have to discretise them. Using the discretize2d function of the entropy package, the joint entropy of $X_1$ and $X_2$ can be calculated as follows:
set.seed(1234)
data <- matrix(rnorm(150 * 11, mean = 0, sd = 1), 150, 11)
library(entropy)
dis <- discretize2d(data[,1],data[,2], numBins1 = 10, numBins2 = 10)
entropy(dis)
I want to create a list containing the discretize2d results for every pair of variables in data, so that I can later just call entropy(dis$`1.2`) and get the same result as entropy(dis) above. Can someone help me code it?
CodePudding user response:
Here is a solution that needs nothing beyond base R and the entropy package you already use. The combn(x, m) function generates all combinations of the elements of x of size m. Here we want pairs, so m = 2. With 11 columns this creates a 2 by 55 matrix, with one pair of column indices in each column. Then use apply() to apply discretize2d() to each column of that matrix. The second argument of apply() is 2, meaning to apply over columns. We also specify simplify = FALSE (available in R 4.1.0 and later) so that the result stays a list rather than being coerced to an array.
combs <- combn(1:ncol(data), 2)
dis <- apply(combs, 2, function(x) discretize2d(data[, x[1]], data[, x[2]], numBins1 = 10, numBins2 = 10), simplify = FALSE)
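To see what combn() produces and how each pair is consumed, here is a minimal sketch on a toy matrix with 4 columns. It uses lapply() over the column positions, which always returns a list and so should also work on R versions where apply() does not accept simplify. The names toy, pairs, res and the helper count_2d are assumptions made up for this illustration; count_2d is only a rough base-R stand-in for discretize2d(), not its actual implementation.

```r
# Toy data: 4 columns instead of 11, so the pairs are easy to inspect.
set.seed(1)
toy <- matrix(rnorm(20 * 4), nrow = 20, ncol = 4)

# Each column of `pairs` holds one pair of column indices.
pairs <- combn(ncol(toy), 2)
dim(pairs)  # 2 rows, choose(4, 2) = 6 columns

# Hypothetical stand-in for discretize2d(): a 2D table of bin counts.
count_2d <- function(x, y, bins = 3) table(cut(x, bins), cut(y, bins))

# lapply() over the column positions of `pairs` always returns a list.
res <- lapply(seq_len(ncol(pairs)), function(j) {
  count_2d(toy[, pairs[1, j]], toy[, pairs[2, j]])
})
length(res)  # one count table per pair
```

With the real data you would replace count_2d() with discretize2d() and toy with data.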
If you want names for the elements like you specified, such as dis$`1.2` (the backticks are needed because 1.2 is not a syntactic name in R), you can do this:
names(dis) <- apply(combs, 2, paste, collapse = '.')
Finally, you could also calculate the entropy for all elements at once with lapply():
lapply(dis, entropy)
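If a named numeric vector is more convenient than a list, vapply() can be used in place of lapply(). The sketch below shows this on a small hand-made list, together with the two ways of accessing an element whose name is not syntactic. The names dis_toy, ent and plugin_entropy are assumptions for this illustration; plugin_entropy is a simple plug-in Shannon entropy in nats, used here only so the example runs without the entropy package.

```r
# Hypothetical stand-in for entropy(): plug-in Shannon entropy in nats.
plugin_entropy <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# A small named list of count tables, standing in for `dis`.
dis_toy <- list(
  `1.2` = matrix(c(5, 5, 5, 5), 2, 2),
  `1.3` = matrix(c(10, 0, 0, 10), 2, 2)
)

# Names such as 1.2 are not syntactic, so use backticks or [[ ]].
plugin_entropy(dis_toy$`1.2`)
plugin_entropy(dis_toy[["1.3"]])

# vapply() returns a named numeric vector instead of a list.
ent <- vapply(dis_toy, plugin_entropy, numeric(1))
ent
```

On the real list, the corresponding call would be vapply(dis, entropy, numeric(1)).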