Counting elements inside a matrix

Time:01-27

I'm generating random matrices filled with zeros and ones. Their dimensions may differ from one simulation to the next.

An example matrix is shown below:

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    0    0    0    0    1    0     0
 [2,]    0    1    1    0    0    0    0    0    0     0
 [3,]    0    0    0    0    1    0    0    0    0     1
 [4,]    0    1    0    0    0    0    0    0    0     0
 [5,]    0    0    0    0    1    0    0    0    0     1
 [6,]    1    0    1    0    0    0    1    1    1     0
 [7,]    0    0    0    0    0    0    1    1    0     0
 [8,]    0    0    0    0    0    0    0    0    0     0
 [9,]    0    0    1    0    0    1    0    0    1     1
[10,]    0    0    0    0    0    0    0    1    0     0

The dput() version:

structure(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 
0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 
0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 
0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0), .Dim = c(10L, 
10L))

I would like to compute two things:

  1. the number of clusters formed by ones (a cluster is a set of adjacent ones; diagonal neighbours do not count as adjacent),
  2. the number of ones within each cluster.
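
To make the definition of a cluster concrete, here is a minimal base R sketch that labels 4-connected groups of ones with an iterative flood fill and records the size of each group. The function name cluster_sizes and its internals are illustrative assumptions, not part of the raster package, and the sketch assumes the matrix contains only 0 and 1 (no NA).

```r
# Hypothetical helper: sizes of all 4-connected clusters of ones (base R only).
cluster_sizes <- function(grid) {
  nr <- nrow(grid)
  nc <- ncol(grid)
  labels <- matrix(0L, nr, nc)  # 0 means "not yet assigned to a cluster"
  sizes <- integer(0)
  # Row/column offsets of the four edge-sharing neighbours (no diagonals).
  dr <- c(-1L, 1L, 0L, 0L)
  dc <- c(0L, 0L, -1L, 1L)

  for (j in seq_len(nc)) {
    for (i in seq_len(nr)) {
      # Skip zeros and cells that already belong to a cluster.
      if (grid[i, j] != 1 || labels[i, j] != 0L) next
      id <- length(sizes) + 1L
      labels[i, j] <- id
      stack <- list(c(i, j))
      n <- 0L
      while (length(stack) > 0) {
        # Pop the most recently pushed cell and count it.
        cell <- stack[[length(stack)]]
        stack[[length(stack)]] <- NULL
        n <- n + 1L
        for (k in 1:4) {
          r <- cell[1] + dr[k]
          cc <- cell[2] + dc[k]
          if (r >= 1 && r <= nr && cc >= 1 && cc <= nc &&
              grid[r, cc] == 1 && labels[r, cc] == 0L) {
            labels[r, cc] <- id  # mark before pushing so no cell is pushed twice
            stack[[length(stack) + 1L]] <- c(r, cc)
          }
        }
      }
      sizes[id] <- n
    }
  }
  sizes
}

grid <- structure(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1,
0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0), .Dim = c(10L,
10L))

sizes <- cluster_sizes(grid)
length(sizes)  # number of clusters
sort(sizes)    # cluster sizes in ascending order
```

The loops are plain R, so this is only practical for small grids; it is mainly useful for checking results on examples like the one above.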

I think I managed to solve the first point with this function:

library(raster)
count_clusters <- function(grid) {
  attr(clump(raster(grid), direc=4), 'data')@max
}

This function returns 14 for the matrix above, which is correct.

Unfortunately, I don't know how to solve the second task. The function I need should return the following output: c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 5).

I would appreciate any hints or tips.

CodePudding user response:

To compute the number of ones within each cluster:

library(raster)
library(dplyr)  # provides tibble(), %>%, filter(), group_by() and tally()

grid <- structure(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0,
                    0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                    0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
                    0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1,
                    0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0), .Dim = c(10L,
                                                                                          10L))

# Label each 4-connected cluster of ones with an id; all other cells become NA
x <- clump(raster(grid), directions = 4)

Get the cell values (the cluster id of each cell) from the RasterLayer's @data@values slot:

vals <- x@data@values

Create a data frame with the values:

dt <- tibble(cluster = vals)

Remove the NA values, group by cluster, and count the cells in each group:

result <- dt %>% 
  filter(!is.na(cluster)) %>%
  group_by(cluster) %>% 
  tally()

result$n
 [1] 1 2 1 1 1 1 1 1 1 5 1 1 2 1
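
The counts above are listed in cluster-id order, while the question asks for them in ascending order; sort(result$n) would rearrange them. As a hedged alternative, the same counting step can be sketched in base R with table(), which drops NA by default. The vals vector below is a small made-up stand-in for x@data@values, used only so the snippet runs on its own.

```r
# Hypothetical cluster ids, one per cell; NA marks cells outside any cluster.
vals <- c(NA, 1, 1, NA, 2, NA, 3, 3, 3, NA)

# table() ignores NA by default, so each count is the size of one cluster.
sizes <- as.vector(table(vals))

# Sort ascending to match the ordering requested in the question.
sort(sizes)
```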