Background
I've got a dataframe df:
df <- data.frame(task = c("a", "b", "c", "d", "e"),
                 rater_1 = c(1, 0, 1, 0, 0),
                 rater_2 = c(1, 0, 1, 1, 1),
                 rater_3 = c(1, 0, 0, 0, 0),
                 stringsAsFactors = FALSE)
> df
task rater_1 rater_2 rater_3
1 a 1 1 1
2 b 0 0 0
3 c 1 1 0
4 d 0 1 0
5 e 0 1 0
Raters are given rating tasks about the quality of a product: if the thing they're rating is of good quality, it gets a 1; if not, it gets a 0.
The problem
I'd like to get R to break down how many of the 5 tasks rated had 3/3 raters mark 1, how many had 2/3 raters mark 1, etc., and to include a percent, too.
I'm looking for something like this:
raters count percent
3 of 3 1 20.0
2 of 3 1 20.0
1 of 3 2 40.0
0 of 3 1 20.0
What I've tried
I've managed to get dplyr to sum across rows, but I can't then collapse it all like I want:
library(dplyr)
df %>%
  mutate(sum1 = rowSums(across(where(is.numeric))))
task rater_1 rater_2 rater_3 sum1
1 a 1 1 1 3
2 b 0 0 0 0
3 c 1 1 0 2
4 d 0 1 0 1
5 e 0 1 0 1
I think I'm overthinking things, but I'm running on little sleep and am missing several billion neurons. Thanks.
CodePudding user response:
Perhaps something like this?
library(dplyr)
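df <- data.frame(task = c("a", "b", "c", "d", "e"),
                 rater_1 = c(1, 0, 1, 0, 0),
                 rater_2 = c(1, 0, 1, 1, 1),
                 rater_3 = c(1, 0, 0, 0, 0))
df |>
  # count = number of raters who marked 1 for each task
  mutate(count = rowSums(across(where(is.numeric)))) |>
  group_by(count) |>
  # pct = share of tasks at each level, as a proportion (0.2 = 20%)
  summarize(pct = n() / nrow(df))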
#> # A tibble: 4 x 2
#> count pct
#> <dbl> <dbl>
#> 1 0 0.2
#> 2 1 0.4
#> 3 2 0.2
#> 4 3 0.2
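The result above gives proportions only, so here is a minimal sketch of one way to get closer to the raters / count / percent layout asked for in the question. It assumes the rating columns all start with "rater_"; the names n_raters, n_yes and result are made up for illustration.

```r
library(dplyr)

df <- data.frame(task = c("a", "b", "c", "d", "e"),
                 rater_1 = c(1, 0, 1, 0, 0),
                 rater_2 = c(1, 0, 1, 1, 1),
                 rater_3 = c(1, 0, 0, 0, 0))

# Number of rating columns (assumes they are all named "rater_*")
n_raters <- sum(startsWith(names(df), "rater_"))

result <- df |>
  # n_yes = how many raters marked 1 for each task
  mutate(n_yes = rowSums(across(starts_with("rater_")))) |>
  # one row per distinct n_yes, with the number of tasks in `count`
  count(n_yes, name = "count") |>
  mutate(raters = paste(n_yes, "of", n_raters),
         percent = 100 * count / sum(count)) |>
  # highest agreement first, as in the desired output
  arrange(desc(n_yes)) |>
  select(raters, count, percent)

result
```

Note that a level with no tasks (say, if nothing had been rated "2 of 3") would simply be missing from the result rather than showing up with a count of 0.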