This is the code used to derive the first table in my question.
JH %>% group_by(ATT_ID, CAR=="B") %>%
summarize(count = n(), .groups = "drop")
ATT_ID | CAR == "B" | Count |
---|---|---|
ONE | FALSE | 1 |
TWO | TRUE | 1 |
THREE | TRUE | 3 |
THREE | FALSE | 5 |
FOUR | FALSE | 2 |
FIVE | TRUE | 4 |
SIX | TRUE | 8 |
SIX | FALSE | 4 |
How can I get the table above to look like:
ATT_ID | Percentage of "B" |
---|---|
ONE | 0% |
TWO | 100% |
THREE | 37.5% |
FOUR | 0% |
FIVE | 100% |
SIX | 67% |
- Notice how some ID's are seen twice so as to show the presence of both FALSE & TRUE whereas other ID's appear once to showcase the presence of only one or the other.
Thank you
CodePudding user response:
Grouped by 'ATT_ID', get the sum
of Count
where CAR=="B"
is TRUE and divide by the sum
of full Count
library(dplyr)
df1 %>%
group_by(ATT_ID = factor(ATT_ID, levels = unique(ATT_ID))) %>%
summarise(Percentage_of_B = paste0(round(
sum(Count[`CAR == "B"`])/sum(Count) * 100, 1), "%"))
-output
# A tibble: 6 × 2
ATT_ID Percentage_of_B
<fct> <chr>
1 ONE 0%
2 TWO 100%
3 THREE 37.5%
4 FOUR 0%
5 FIVE 100%
6 SIX 66.7%
data
df1 <- structure(list(ATT_ID = c("ONE", "TWO", "THREE", "THREE", "FOUR",
"FIVE", "SIX", "SIX"), `CAR == "B"` = c(FALSE, TRUE, TRUE, FALSE,
FALSE, TRUE, TRUE, FALSE), Count = c(1L, 1L, 3L, 5L, 2L, 4L,
8L, 4L)), class = "data.frame", row.names = c(NA, -8L))