Counting the frequency of several categorical variables in R

Time:04-29

I need to create a data frame containing the frequency of each categorical variable from a previous data frame. Fortunately, these variables are all coded as numbers from 1 to 5 rather than as text.

Therefore, I could create a new data frame with a first column containing the numbers 1 to 5, and each following column counting the frequency of that number as the response for each variable in the original data frame.

For example, we have an original df defined as:

df1 <- data.frame(
             Z = c(4,   1,  2,  1,  5,  4,  2,  5,  1,  5),
             Y = c(5,   1,  5,  5,  2,  1,  4,  1,  3,  3),
             X = c(4,   2,  2,  1,  5,  1,  5,  1,  3,  2),
             W = c(2,   1,  4,  2,  3,  2,  4,  2,  1,  2),
             V = c(5,   1,  3,  3,  3,  3,  2,  4,  4,  1))

I would need a second df containing the following table:

fq  Z   Y   X   W   V
1   3   3   3   2   2
2   4   2   6   10  2
3   0   6   3   3   12
4   8   4   4   8   8
5   15  15  10  0   5

I saw some answers on how to do something like this using plyr, but not in a systematic way. Can someone help me out?

CodePudding user response:

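We may use tapply inside sapply to sum the values within each level (levels that never occur in a column come back as NA rather than 0):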

sapply(df1, function(x) tapply(x, factor(x, levels = 1:5),  FUN = sum))
   Z  Y  X  W  V
1  3  3  3  2  2
2  4  2  6 10  2
3 NA  6  3  3 12
4  8  4  4  8  8
5 15 15 10 NA  5
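If the zeros from the requested table are wanted instead of NA, one possible follow-up is to overwrite the NA cells after the call. This is only a sketch on the example data; the name res is introduced here for illustration.

```r
df1 <- data.frame(
  Z = c(4, 1, 2, 1, 5, 4, 2, 5, 1, 5),
  Y = c(5, 1, 5, 5, 2, 1, 4, 1, 3, 3),
  X = c(4, 2, 2, 1, 5, 1, 5, 1, 3, 2),
  W = c(2, 1, 4, 2, 3, 2, 4, 2, 1, 2),
  V = c(5, 1, 3, 3, 3, 3, 2, 4, 4, 1))

# Sum the values within each level 1..5; levels with no observations give NA
res <- sapply(df1, function(x) tapply(x, factor(x, levels = 1:5), FUN = sum))

# Overwrite the NA cells (levels that never occur in a column) with 0
res[is.na(res)] <- 0
res
```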

CodePudding user response:

table(stack(df1)) * 1:5

    ind
values  Z  Y  X  W  V
     1  3  3  3  2  2
     2  4  2  6 10  2
     3  0  6  3  3 12
     4  8  4  4  8  8
     5 15 15 10  0  5

If you need a data.frame, you could do:

as.data.frame.matrix(table(stack(df1)) * 1:5)
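The resulting data frame keeps the values 1 to 5 as row names. To get the leading fq column from the question, something along these lines should work. It is a sketch: the names tab and out are made up here, and the multiplication by 1:5 assumes every value from 1 to 5 occurs at least once somewhere in df1, so that the table has exactly five rows.

```r
df1 <- data.frame(
  Z = c(4, 1, 2, 1, 5, 4, 2, 5, 1, 5),
  Y = c(5, 1, 5, 5, 2, 1, 4, 1, 3, 3),
  X = c(4, 2, 2, 1, 5, 1, 5, 1, 3, 2),
  W = c(2, 1, 4, 2, 3, 2, 4, 2, 1, 2),
  V = c(5, 1, 3, 3, 3, 3, 2, 4, 4, 1))

# Counts per value and column, with each row scaled by its value (1..5)
tab <- table(stack(df1)) * 1:5

# Convert to a data frame and move the row names into an explicit fq column
out <- as.data.frame.matrix(tab)
out <- cbind(fq = as.integer(rownames(out)), out)
rownames(out) <- NULL
out
```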

CodePudding user response:

Another possible solution, based on purrr::map_dfc:

library(tidyverse)

map_dfc(df1, ~ 1:5 * table(factor(.x, levels = 1:5)) %>% as.vector) 

#> # A tibble: 5 × 5
#>       Z     Y     X     W     V
#>   <int> <int> <int> <int> <int>
#> 1     3     3     3     2     2
#> 2     4     2     6    10     2
#> 3     0     6     3     3    12
#> 4     8     4     4     8     8
#> 5    15    15    10     0     5
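Note that the table in the question holds each value multiplied by the number of times it occurs, which is what all the answers above reproduce. If plain counts are what is actually needed, as the title suggests, dropping the multiplication should be enough. A base R sketch, with counts as an illustrative name:

```r
df1 <- data.frame(
  Z = c(4, 1, 2, 1, 5, 4, 2, 5, 1, 5),
  Y = c(5, 1, 5, 5, 2, 1, 4, 1, 3, 3),
  X = c(4, 2, 2, 1, 5, 1, 5, 1, 3, 2),
  W = c(2, 1, 4, 2, 3, 2, 4, 2, 1, 2),
  V = c(5, 1, 3, 3, 3, 3, 2, 4, 4, 1))

# Count how often each value 1..5 occurs in every column;
# fixing the factor levels keeps values that never occur as 0
counts <- sapply(df1, function(x) table(factor(x, levels = 1:5)))
counts

# Scaling each row by its value gives back the table from the question
counts * 1:5
```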