In R, sum across rows then summarize with counts and percentages

Background

I've got a dataframe df:

df <- data.frame(task = c("a","b","c", "d","e"),
                 rater_1 = c(1,0,1,0,0),
                 rater_2 = c(1,0,1,1,1),
                 rater_3 = c(1,0,0,0,0),
                 stringsAsFactors=FALSE)

> df
  task rater_1 rater_2 rater_3
1    a       1       1       1
2    b       0       0       0
3    c       1       1       0
4    d       0       1       0
5    e       0       1       0

Raters are given tasks that ask them to rate the quality of a product: if the thing they're rating is of good quality, it gets a 1; if not, it gets a 0.

The problem

I'd like to get R to break down how many of the 5 tasks had 3 of 3 raters mark 1, how many had 2 of 3 raters mark 1, and so on, and to show each count as a percentage of all tasks, too.

I'm looking for something like this:

raters  count  percent
3 of 3  1      20.0
2 of 3  1      20.0
1 of 3  2      40.0
0 of 3  1      20.0

What I've tried

I've managed to get dplyr to sum across rows, but I can't then collapse it all like I want:

library(dplyr)

df %>%
  mutate(sum1 = rowSums(across(where(is.numeric))))

  task rater_1 rater_2 rater_3 sum1
1    a       1       1       1    3
2    b       0       0       0    0
3    c       1       1       0    2
4    d       0       1       0    1
5    e       0       1       0    1

I think I'm overthinking things, but I'm running on little sleep and am missing several billion neurons. Thanks.

Answer

Perhaps something like this?

library(dplyr)

df <- data.frame(task = c("a", "b", "c", "d", "e"),
                 rater_1 = c(1,0,1,0,0),
                 rater_2 = c(1,0,1,1,1),
                 rater_3 = c(1,0,0,0,0))

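df |>
  # count = number of raters (out of 3) who marked 1 for each task
  mutate(count = rowSums(across(where(is.numeric)))) |>
  group_by(count) |>
  # pct = share of all tasks in each group, as a proportion (0.2 means 20%)
  summarize(pct = n()/nrow(df))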

#> # A tibble: 4 x 2
#>   count   pct
#>   <dbl> <dbl>
#> 1     0   0.2
#> 2     1   0.4
#> 3     2   0.2
#> 4     3   0.2
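
The result above gives a proportion per number of positive ratings, but not the task count or the "3 of 3" labels asked for in the question. Here is a minimal sketch that gets closer to the requested table. It assumes every rater column starts with "rater_"; the names n_raters, n_pos and result are only illustrative.

```r
library(dplyr)

df <- data.frame(task = c("a", "b", "c", "d", "e"),
                 rater_1 = c(1, 0, 1, 0, 0),
                 rater_2 = c(1, 0, 1, 1, 1),
                 rater_3 = c(1, 0, 0, 0, 0))

# Number of rater columns (assumption: they all start with "rater_")
n_raters <- sum(startsWith(names(df), "rater_"))

result <- df |>
  # For each task, how many raters marked 1
  mutate(n_pos = rowSums(across(starts_with("rater_")))) |>
  # Tally the tasks for each value of n_pos
  count(n_pos, name = "count") |>
  # Convert counts to percentages and build the "k of n" label
  mutate(percent = 100 * count / sum(count),
         raters = paste(n_pos, "of", n_raters)) |>
  arrange(desc(n_pos)) |>
  select(raters, count, percent)

result
#>   raters count percent
#> 1 3 of 3     1      20
#> 2 2 of 3     1      20
#> 3 1 of 3     2      40
#> 4 0 of 3     1      20
```

Note that a level no task reaches (say, if no task had 2 of 3) would simply be missing from the table rather than shown with a count of 0.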