Subset correlation matrix based on value

Time:06-03

I have a correlation matrix like this:

   A     B     C     D     E     F
A  1     0.7   0.9   0.5   0.83  0.88
B  0.7   1     0.8   0.5   0.95  0.96
C  0.9   0.8   1     0.8   0.97  0.9
D  0.5   0.5   0.97  1     0.87  0.91
E  0.83  0.95  0.97  0.87  1     0.81
F  0.88  0.96  0.9   0.91  0.81  1

I want to subset this correlation matrix and keep only the variables whose correlations are all >= 0.8. For that I need to exclude the diagonal from the comparison, because all of its values equal 1.

So the result would be:

   C     E     F
C  1     0.97  0.9
E  0.97  1     0.81
F  0.9   0.81  1

Thank you

CodePudding user response:

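One idea is to build a logical matrix that marks the values >= 0.8, then use rowSums and colSums to keep only the rows and columns in which every entry is TRUE, that is, where the count of TRUEs equals the matrix dimension. The diagonal needs no special handling here, since 1 always passes the >= 0.8 test.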

m1[rowSums(m1 >= 0.8) == ncol(m1), colSums(m1 >= 0.8) == nrow(m1)]

     C    E    F
C 1.00 0.97 0.90
E 0.97 1.00 0.81
F 0.90 0.81 1.00
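
For anyone who wants to run this, here is a minimal reproducible sketch. It assumes the matrix from the question is stored as `m1` and typed in row by row exactly as posted (so the C/D and D/C entries differ, as they do in the question). The names `vars`, `ok`, `keep_rows`, `keep_cols` and `res` are only illustrative.

```r
# Values copied from the question, row by row.
# Note: the posted matrix is not symmetric (C/D is 0.8, D/C is 0.97).
vars <- c("A", "B", "C", "D", "E", "F")
m1 <- matrix(c(
  1,    0.7,  0.9,  0.5,  0.83, 0.88,
  0.7,  1,    0.8,  0.5,  0.95, 0.96,
  0.9,  0.8,  1,    0.8,  0.97, 0.9,
  0.5,  0.5,  0.97, 1,    0.87, 0.91,
  0.83, 0.95, 0.97, 0.87, 1,    0.81,
  0.88, 0.96, 0.9,  0.91, 0.81, 1
), nrow = 6, byrow = TRUE, dimnames = list(vars, vars))

# Logical matrix: TRUE where the threshold is met
ok <- m1 >= 0.8

# A row has ncol(m1) entries, a column has nrow(m1) entries;
# keep those where every entry is TRUE
keep_rows <- rowSums(ok) == ncol(m1)
keep_cols <- colSums(ok) == nrow(m1)

res <- m1[keep_rows, keep_cols]
print(res)
```

With the values as posted, this keeps C, E and F, matching the output shown above.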

CodePudding user response:

You can try this

> idx <- names(which(rowSums(m >= 0.8) == ncol(m)))

> m[idx, idx]
     C    E    F
C 1.00 0.97 0.90
E 0.97 1.00 0.81
F 0.90 0.81 1.00
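
Both answers rely on the diagonal (1) always passing the >= 0.8 test. If the comparison were different, for example keeping only weakly correlated variables, the diagonal would have to be masked explicitly, as the question suggests. A hedged sketch of that variant follows; `subset_by_threshold` is a hypothetical helper name and the small matrix `toy` is made-up data for illustration only, not taken from the question.

```r
# Hypothetical helper: keep variables whose off-diagonal
# entries are all >= threshold (checked row-wise, as in the answer above).
subset_by_threshold <- function(m, threshold = 0.8) {
  stopifnot(is.matrix(m), nrow(m) == ncol(m))
  off <- m
  diag(off) <- NA                      # mask the diagonal
  # count off-diagonal entries meeting the threshold in each row
  hits <- rowSums(off >= threshold, na.rm = TRUE)
  keep <- hits == ncol(m) - 1          # all off-diagonal entries pass
  m[keep, keep, drop = FALSE]
}

# Made-up toy data for illustration only
v <- c("a", "b", "c", "d")
toy <- matrix(c(
  1,    0.9,  0.85, 0.2,
  0.9,  1,    0.95, 0.2,
  0.85, 0.95, 1,    0.2,
  0.2,  0.2,  0.2,  1
), nrow = 4, byrow = TRUE, dimnames = list(v, v))

out <- subset_by_threshold(toy, 0.8)
print(out)
```

Note that this one-pass filter tests each row against all variables, including ones that end up being dropped, which is the same behavior as the answers above.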