How to write a loop to count elements in all columns of a data frame in R

Time:12-29

How can I write a loop that counts the elements in every column of a data frame in R and shows the result?

This is what I tried, but it does not show the result:

count_1 <- for (col in colnames(df)){x = count(df, col)}

CodePudding user response:

Assuming that each cell in the data frame only contains one element, and not a list or vector itself, you could just multiply the dimensions. For example:

set.seed(23L)
A <- data.frame(A = rnorm(10), B = sample(LETTERS, 10), C = runif(10, 1, 3))
A
prod(dim(A))

This results in:

             A B        C
1   0.19321233 V 1.698088
2  -0.43468211 M 1.658262
3   0.91326710 J 2.619008
4   1.79338809 L 1.548964
5   0.99660511 X 2.878190
6   1.10749049 F 2.804534
7  -0.27808628 G 1.354106
8   1.01920549 B 2.596503
9   0.04543718 O 1.165596
10  1.57577959 N 1.571384

[1] 30

If you have vectors within cells of the data frame, it gets a bit more complicated. We can't use vapply or sapply with length, as that would only give the length of each column (the number of rows), not the length of each cell. However, rapply is your friend here. A data frame is a special kind of list, so calling rapply on it with length returns the length of every cell, and sum(rapply(A, length)) should do what you want. Here is an example:

A <- data.frame(A=double(2), B=double(2), C=character(2))
A$A <- list(c(1, 2), c(3, 4))
A$B <- list(7:9, 8:10)
A$C <- list(c('A', 'B'), c('D', 'E'))
A
     A        B    C
1 1, 2  7, 8, 9 A, B
2 3, 4 8, 9, 10 D, E

dim(A)
[1] 2 3

vapply(A, length, integer(1L))
A B C 
2 2 2 

rapply(A, length)
A1 A2 B1 B2 C1 C2 
 2  2  3  3  2  2 

sum(rapply(A, length))
[1] 14

And 14 is the correct answer.
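If you also want the count per column rather than only the grand total, a minimal sketch using base R's lengths() could look like the following. It rebuilds the same A as above; the names per_column and total are just illustrative.

```r
# Rebuild the list-column data frame from the example above
A <- data.frame(A = double(2), B = double(2), C = character(2))
A$A <- list(c(1, 2), c(3, 4))
A$B <- list(7:9, 8:10)
A$C <- list(c("A", "B"), c("D", "E"))

# lengths() returns the length of every cell in a column;
# summing those gives the element count for that column
per_column <- vapply(A, function(col) sum(lengths(col)), integer(1))
per_column
# A B C
# 4 6 4

# Total over the whole data frame, matching sum(rapply(A, length))
total <- sum(per_column)
total
# [1] 14
```

This should also work for ordinary columns, since lengths() returns 1 for each element of an atomic vector.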

CodePudding user response:

If you really want to write a loop, you can do the following, where i is each column and j is each element of that column:

k <- 0
for(i in iris){
  for(j in i){
    k <- k + 1
  }
}
k

But it is better to use nrow(df) * ncol(df), length(df$x) * length(df) (where x is any column of df), or something similar.
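As for why the attempt in the question shows nothing: a for loop in R returns NULL, so assigning the loop itself to a variable stores no counts. A rough sketch of a loop that stores one count per column, using iris as in the loop above (counts and total are illustrative names), might be:

```r
# Pre-allocate one slot per column and label the slots
counts <- integer(ncol(iris))
names(counts) <- names(iris)

# Store the result inside the loop instead of assigning the loop itself
for (col in names(iris)) {
  counts[col] <- length(iris[[col]])
}

print(counts)   # 150 for each of the five columns

# The grand total agrees with the shortcut nrow(df) * ncol(df)
total <- sum(counts)
total
# [1] 750
```

Note that length() counts every row, including NA values, so this assumes each cell holds exactly one element.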

CodePudding user response:

Here I use the iris data set. You can use a loop like this to build a list that holds a frequency table for each column of the data frame:

freq <- vector("list", length(iris))

for (i in seq_along(iris)){
    freq[[i]] <- table(iris[[i]])
}

str(freq)
List of 5
 $ : 'table' int [1:35(1d)] 1 3 1 4 2 5 6 10 9 4 ...
  ..- attr(*, "dimnames")=List of 1
  .. ..$ : chr [1:35] "4.3" "4.4" "4.5" "4.6" ...
 $ : 'table' int [1:23(1d)] 1 3 4 3 8 5 9 14 10 26 ...
  ..- attr(*, "dimnames")=List of 1
  .. ..$ : chr [1:23] "2" "2.2" "2.3" "2.4" ...
 $ : 'table' int [1:43(1d)] 1 1 2 7 13 13 7 4 2 1 ...
  ..- attr(*, "dimnames")=List of 1
  .. ..$ : chr [1:43] "1" "1.1" "1.2" "1.3" ...
 $ : 'table' int [1:22(1d)] 5 29 7 7 1 1 7 3 5 13 ...
  ..- attr(*, "dimnames")=List of 1
  .. ..$ : chr [1:22] "0.1" "0.2" "0.3" "0.4" ...
 $ : 'table' int [1:3(1d)] 50 50 50
  ..- attr(*, "dimnames")=List of 1
  .. ..$ : chr [1:3] "setosa" "versicolor" "virginica"

But perhaps using functionals is a better option:

lapply(iris, table)

apply(iris, 2, table)

sapply(iris, table)
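To illustrate how the functional version can be used, here is a small sketch. One difference worth knowing: lapply keeps the column names on the resulting list, while apply first converts the data frame to a matrix, which for mixed column types such as iris means a character matrix. The name n_distinct below is only illustrative.

```r
# One frequency table per column, named after the columns
freq <- lapply(iris, table)
names(freq)

# Look up a single table, or a single count within it
freq$Species
freq$Species[["setosa"]]
# [1] 50

# Number of distinct values in each column
n_distinct <- vapply(freq, length, integer(1))
n_distinct

# apply() works on as.matrix(iris), which is a character matrix here
is.character(as.matrix(iris))
# [1] TRUE
```

Because of that conversion, lapply or sapply is probably the safer choice when the columns have different types.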