First, sorry for the basic question, but there is one thing in my code I couldn't figure out.
My data frame is like this:
ID <- c("a","b","c","d","e")
age <- c(22,34,55,55,45)
gender <- c("female","male","female","female", "male")
df <- data.frame(ID, age, gender)
df
ID age gender
a 22 female
b 34 male
c 55 female
d 55 female
e 45 male
I simply want to count gender as both a frequency and a percentage.
When I write my code as below, every percentage comes out as 100%. It seems to take the sum within each gender rather than over the whole gender distribution, which I guess is why it gives 100%:
df %>% group_by(gender)%>%
summarise(n = n(), freq = paste0(round(100 * n/sum(n), 0), "%"))
gender n freq
<chr> <int> <chr>
female 3 100%
male 2 100%
I wanted to ask what I am doing wrong.
Thank you so much!
CodePudding user response:
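Try breaking them into separate steps. Inside summarise() on a grouped data frame, n and sum(n) are both evaluated within each group, so n/sum(n) is always 1. After summarise() the grouping is dropped, so sum(n) in a following mutate() is the total over all rows: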
df %>% group_by(gender) %>%
summarise(n = n()) %>%
mutate(freq = paste0(round(n / sum(n) * 100, 0), "%"))
Output:
# gender n freq
# <chr> <int> <chr>
# 1 female 3 60%
# 2 male 2 40%
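For reference, here is a self-contained sketch of the same idea, assuming dplyr is installed. It uses count(), which is shorthand for group_by() followed by summarise(n = n()); the variable name result is only for illustration.

```r
library(dplyr)

df <- data.frame(
  ID = c("a", "b", "c", "d", "e"),
  age = c(22, 34, 55, 55, 45),
  gender = c("female", "male", "female", "female", "male")
)

# count() returns one ungrouped row per gender, with the count in column n
result <- df %>%
  count(gender) %>%
  # the data is no longer grouped, so sum(n) is the total number of rows
  mutate(freq = paste0(round(100 * n / sum(n)), "%"))

result
```

Because the percentage is computed after the counts have been collapsed to one row per gender, the denominator is the overall total rather than the per-group count.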
CodePudding user response:
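Another solution, dividing by nrow(.), the total number of rows of the data frame piped into summarise() (this relies on the %>% pipe, which provides the . placeholder):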
df %>%
group_by(gender) %>%
summarise(n = n(), freq = paste0(round(n/nrow(.) * 100), "%"))
# A tibble: 2 x 3
gender n freq
<chr> <int> <chr>
1 female 3 60%
2 male 2 40%
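As a quick cross-check without dplyr, the same numbers can be obtained in base R. This is only a sketch of an alternative, not what either answer above uses; the names counts, pct and out are made up for the example.

```r
gender <- c("female", "male", "female", "female", "male")

# table() counts each value; prop.table() divides each count by the grand total
counts <- table(gender)
pct <- round(100 * prop.table(counts))

out <- data.frame(
  gender = names(counts),
  n = as.vector(counts),
  freq = paste0(as.vector(pct), "%")
)

out
```

If this agrees with the dplyr result, the denominator is the whole data frame and not the per-group count.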