Creating frequencies in the case of creating a table of all possible variables combinations using ti

Time:08-24

I have this example data frame in RStudio:

      mode    sex age_group
1  neutral female    middle
2    happy   male    senior
3   grumpy female    middle
4  neutral female    middle
5   grumpy female    middle
6  neutral female    middle
7   grumpy female    middle
8  neutral female    middle
9  neutral female    middle
10  grumpy female    middle
11 neutral female    middle
12 neutral female    middle
13  grumpy female    middle
14  grumpy female    middle
15  grumpy female    middle
16 neutral female    middle
17  grumpy female    middle
18   happy   male     young
19  grumpy female    middle
20 neutral   male    senior
21 neutral female    middle
22  grumpy female    middle
23  grumpy female    middle
24  grumpy female    middle
25   happy   male     young
26  grumpy female    middle
27 neutral   male    senior
28  grumpy female    middle
29   happy   male    senior
30 neutral female    middle
31  grumpy female    middle
32 neutral female    middle
33 neutral female    middle
34 neutral female    middle
35  grumpy female    middle
36   happy   male    senior
37  grumpy female    middle
38   happy   male    senior
39 neutral   male    senior
40   happy   male     young
41 neutral   male    senior
42  grumpy female    middle
43 neutral   male    senior
44   happy   male     young
45 neutral female    middle
46  grumpy female    middle
47 neutral female    middle
48   happy   male     young
49 neutral   male    senior
50   happy   male    senior

Using tidyr::expand, I was able to create another data frame with all possible combinations of the variables:

      mode    sex age_group
1   grumpy female    middle
2   grumpy female    senior
3   grumpy female     young
4   grumpy   male    middle
5   grumpy   male    senior
6   grumpy   male     young
7    happy female    middle
8    happy female    senior
9    happy female     young
10   happy   male    middle
11   happy   male    senior
12   happy   male     young
13 neutral female    middle
14 neutral female    senior
15 neutral female     young
16 neutral   male    middle
17 neutral   male    senior
18 neutral   male     young

However, I would like to add a column named "Frequencies" to the combinations data frame that holds the frequency of each combination of the variables (18 frequencies in total).

Can someone help me make that with a simple function?

Thanks

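# The data frame is created as follows.
# Note: the seed is reset before each call, so the three samples reuse the
# same random draws and the resulting columns are strongly dependent.

set.seed(111)
mode <- sample(c("happy", "neutral", "grumpy"),
               size = 50,
               replace = TRUE,
               prob = c(0.3, 0.3, 0.4))

set.seed(111)
sex <- sample(c("female", "male"),
              size = 50,
              replace = TRUE,
              prob = c(0.6, 0.4))

set.seed(111)
age_group <- sample(c("young", "middle", "senior"),
                    size = 50,
                    replace = TRUE,
                    prob = c(0.2, 0.6, 0.2))

# Combine the three vectors into one data frame
status <- data.frame(mode = mode,
                     sex = sex,
                     age_group = age_group)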

CodePudding user response:

With count, you can set .drop = FALSE to include all possible combinations (even if the count is 0):

library(dplyr)

status %>% 
  mutate(across(everything(), factor)) %>% 
  count(mode, sex, age_group, .drop = FALSE)
      mode    sex age_group  n
1   grumpy female    middle 19
2   grumpy female    senior  0
3   grumpy female     young  0
4   grumpy   male    middle  0
5   grumpy   male    senior  0
6   grumpy   male     young  0
7    happy female    middle  0
8    happy female    senior  0
9    happy female     young  0
10   happy   male    middle  0
11   happy   male    senior  5
12   happy   male     young  5
13 neutral female    middle 15
14 neutral female    senior  0
15 neutral female     young  0
16 neutral   male    middle  0
17 neutral   male    senior  6
18 neutral   male     young  0
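
As a possible alternative that avoids converting the columns to factors, the counts can be computed first and the missing combinations filled in afterwards with tidyr::complete. The sketch below is only an illustration: `toy` is a small made-up stand-in for `status`, and the column is named "Frequencies" through the `name` argument of count.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for `status` (not the question's data)
toy <- data.frame(
  mode      = c("happy", "grumpy", "grumpy", "neutral"),
  sex       = c("male", "female", "female", "male"),
  age_group = c("young", "middle", "middle", "senior")
)

freq <- toy %>%
  # Count the combinations that actually occur
  count(mode, sex, age_group, name = "Frequencies") %>%
  # Add every missing combination of the observed values, with a count of 0
  complete(mode, sex, age_group, fill = list(Frequencies = 0L))

print(freq)
```

With 3 modes, 2 sexes and 3 age groups in the toy data, `freq` has 18 rows, and the combinations that never occur get a frequency of 0.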

CodePudding user response:

In base R:

data.frame(table(status))
      mode    sex age_group Freq
1   grumpy female    middle   19
2    happy female    middle    0
3  neutral female    middle   15
4   grumpy   male    middle    0
5    happy   male    middle    0
6  neutral   male    middle    0
7   grumpy female    senior    0
8    happy female    senior    0
9  neutral female    senior    0
10  grumpy   male    senior    0
11   happy   male    senior    5
12 neutral   male    senior    6
13  grumpy female     young    0
14   happy female     young    0
15 neutral female     young    0
16  grumpy   male     young    0
17   happy   male     young    5
18 neutral   male     young    0
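
If the count column should be called "Frequencies" rather than "Freq", one option is the `responseName` argument of as.data.frame for tables. A minimal sketch, again on a made-up `toy` data frame rather than the question's `status`:

```r
# Hypothetical toy data standing in for `status` (not the question's data)
toy <- data.frame(
  mode      = c("happy", "grumpy", "grumpy", "neutral"),
  sex       = c("male", "female", "female", "male"),
  age_group = c("young", "middle", "middle", "senior")
)

# table() cross-tabulates all columns; responseName sets the count column name
freq <- as.data.frame(table(toy), responseName = "Frequencies")

# Optional: sort the rows by mode, then sex, then age_group
freq <- freq[order(freq$mode, freq$sex, freq$age_group), ]

print(freq)
```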

With the tidyverse:

library(dplyr)

status %>%
  mutate(across(everything(), factor)) %>%
  table() %>%
  data.frame()

      mode    sex age_group Freq
1   grumpy female    middle   19
2    happy female    middle    0
3  neutral female    middle   15
4   grumpy   male    middle    0
5    happy   male    middle    0
6  neutral   male    middle    0
7   grumpy female    senior    0
8    happy female    senior    0
9  neutral female    senior    0
10  grumpy   male    senior    0
11   happy   male    senior    5
12 neutral   male    senior    6
13  grumpy female     young    0
14   happy female     young    0
15 neutral female     young    0
16  grumpy   male     young    0
17   happy   male     young    5
18 neutral   male     young    0

CodePudding user response:

We could use

library(dplyr)
status %>%
  mutate(across(everything(), factor)) %>% 
  count(across(everything()), .drop = FALSE)

-output

      mode    sex age_group  n
1   grumpy female    middle 19
2   grumpy female    senior  0
3   grumpy female     young  0
4   grumpy   male    middle  0
5   grumpy   male    senior  0
6   grumpy   male     young  0
7    happy female    middle  0
8    happy female    senior  0
9    happy female     young  0
10   happy   male    middle  0
11   happy   male    senior  5
12   happy   male     young  5
13 neutral female    middle 15
14 neutral female    senior  0
15 neutral female     young  0
16 neutral   male    middle  0
17 neutral   male    senior  6
18 neutral   male     young  0
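
To stay closer to the question's own workflow, the combinations table built with tidyr::expand can also be joined to the observed counts. This is a sketch under assumptions: `toy` is a made-up stand-in for `status`, and `combos` and `freq` are names chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for `status` (not the question's data)
toy <- data.frame(
  mode      = c("happy", "grumpy", "grumpy", "neutral"),
  sex       = c("male", "female", "female", "male"),
  age_group = c("young", "middle", "middle", "senior")
)

# All combinations of the observed values, as in the question
combos <- expand(toy, mode, sex, age_group)

freq <- combos %>%
  # Attach the observed counts; combinations that never occur get NA
  left_join(count(toy, mode, sex, age_group, name = "Frequencies"),
            by = c("mode", "sex", "age_group")) %>%
  # Turn the NA counts into 0
  mutate(Frequencies = coalesce(Frequencies, 0L))

print(freq)
```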