Home > Net >  How to count and group multiple columns in R dataframe?
How to count and group multiple columns in R dataframe?

Time:01-11

really basic question... I have a dataframe like the one below, where the numbers indicate a score:

df<-data.frame(A=c(1,2,1,1,3,3,2,2),B=c(2,2,2,3,2,3,3,1),C=c(1,1,1,1,1,2,2,3))

enter image description here

And I would like to change it to this format to plot it in a stacked bar chart:

enter image description here

I know how to do it in a very roundabout and probably overly complicated way, and any suggestions on a more "streamlined" way to do it would be very welcome! Thanks in advance!

CodePudding user response:

library(tidyverse)

df %>% 
  pivot_longer(everything(), names_to = "Score") %>% 
  count(Score, value, name = "Freq")

# A tibble: 9 × 3
  Score value  Freq
  <chr> <dbl> <int>
1 A         1     3
2 A         2     3
3 A         3     2
4 B         1     1
5 B         2     4
6 B         3     3
7 C         1     5
8 C         2     2
9 C         3     1

CodePudding user response:

The dplyr solutions are likely more scalable, but an alternative base R approach: use do.call along with lapply and table then put it all in a data.frame:

data.frame(Name = rep(c("A", "B", "C"), each = 3),
      Score = rep(1:3, each = 3),
      Frequency = do.call(c, lapply(df[], table)))

#     Name Score Frequency
# A.1    A     1         3
# A.2    A     1         3
# A.3    A     1         2
# B.1    B     2         1
# B.2    B     2         4
# B.3    B     2         3
# C.1    C     3         5
# C.2    C     3         2
# C.3    C     3         1

CodePudding user response:

We can turn the data into long format and then calculate frequency

df%>%
  gather(Name,Score,A:C)%>%
  group_by(Name,Score)%>%
  summarise(Frequency=n())%>%
  ungroup

  Name  Score Frequency
  <chr> <dbl>     <int>
1 A         1         3
2 A         2         3
3 A         3         2
4 B         1         1
5 B         2         4
6 B         3         3
7 C         1         5
8 C         2         2
9 C         3         1

CodePudding user response:

Using base R

 as.data.frame(table(stack(df)[2:1]))
  ind values Freq
1   A      1    3
2   B      1    1
3   C      1    5
4   A      2    3
5   B      2    4
6   C      2    2
7   A      3    2
8   B      3    3
9   C      3    1
  • Related