I have this data frame:
df.bar <- data.frame(diagnosis = c("A","A","A", "nb" ,"nb", "hg"),
C1 = c(1,1,0,0,1,0), C2 = c(0,1,0,0,0,0))
df.bar
  diagnosis C1 C2
1         A  1  0
2         A  1  1
3         A  0  0
4        nb  0  0
5        nb  1  0
6        hg  0  0
I want to calculate the percentage of 1s in C1 and C2 for each diagnosis, as follows:
  diagnosis  C1  C2
1         A 66% 33%
2        nb 50%  0%
3        hg  0%  0%
CodePudding user response:
base solution with aggregate():
aggregate(cbind(C1, C2) ~ diagnosis, df.bar,
\(x) paste0(round(mean(x) * 100, 2), '%'))
dplyr solution:
library(dplyr)
df.bar %>%
group_by(diagnosis) %>%
summarise(across(C1:C2, ~ paste0(round(mean(.x) * 100, 2), '%')))
# # A tibble: 3 × 3
#   diagnosis C1     C2
#   <chr>     <chr>  <chr>
# 1 A         66.67% 33.33%
# 2 hg        0%     0%
# 3 nb        50%    0%
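Both calls above return the groups in alphabetical order (A, hg, nb), while the question lists them as A, nb, hg. A minimal base R sketch, assuming the order of first appearance is what is wanted: turn diagnosis into a factor whose levels follow that order before aggregating. The name res is only illustrative, and rounding to whole percentages gives 67% for A in C1 rather than the 66% written in the question.

```r
df.bar <- data.frame(diagnosis = c("A", "A", "A", "nb", "nb", "hg"),
                     C1 = c(1, 1, 0, 0, 1, 0), C2 = c(0, 1, 0, 0, 0, 0))

# Assumption: groups should appear in order of first appearance (A, nb, hg),
# so fix the factor levels before grouping.
df.bar$diagnosis <- factor(df.bar$diagnosis, levels = unique(df.bar$diagnosis))

# For 0/1 columns the mean is the share of 1s; round to whole percentages.
res <- aggregate(cbind(C1, C2) ~ diagnosis, df.bar,
                 function(x) paste0(round(mean(x) * 100), "%"))
res
#   diagnosis  C1  C2
# 1         A 67% 33%
# 2        nb 50%  0%
# 3        hg  0%  0%
```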
CodePudding user response:
dplyr answer (returns the percentages as plain numbers, without rounding or a % sign):
library(dplyr)
df.bar |>
  group_by(diagnosis) |>
  summarise(C1 = sum(C1) / n() * 100,
            C2 = sum(C2) / n() * 100)
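Keeping the percentages numeric, as this answer does, makes them easy to sort or plot; the % sign can be added as a last step. A rough base R sketch of that two-step idea, where pct and cols are illustrative names and not part of the answer above:

```r
df.bar <- data.frame(diagnosis = c("A", "A", "A", "nb", "nb", "hg"),
                     C1 = c(1, 1, 0, 0, 1, 0), C2 = c(0, 1, 0, 0, 0, 0))

# Step 1: numeric percentages per diagnosis (mean of a 0/1 column = sum / n).
pct <- aggregate(cbind(C1, C2) ~ diagnosis, df.bar, function(x) mean(x) * 100)

# Step 2: format for display only once the numbers are no longer needed.
cols <- c("C1", "C2")
pct[cols] <- lapply(pct[cols], function(x) paste0(round(x, 2), "%"))
pct
```

After step 2 the columns are character strings, so any further arithmetic should happen before it.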
CodePudding user response:
library(tidyverse)
df.bar %>%
group_by(diagnosis) %>%
summarise(
C1 = str_c(round(sum(C1)/n()*100,2), "%"),
C2 = str_c(round(sum(C2)/n()*100,2), "%")
)
# # A tibble: 3 × 3
#   diagnosis C1     C2
#   <chr>     <chr>  <chr>
# 1 A         66.67% 33.33%
# 2 hg        0%     0%
# 3 nb        50%    0%