I've looked through the other problems that have been solved on here, and I can't find anything that quite fits my data and the solution I am looking for yet.
After cleaning my data, I get an output that looks something like this (slimmed down for ease):
Gender | Total |
---|---|
Agender | 9 |
Female | 15 |
Genderfluid | 10 |
Non Binary | 2 |
Male | 14 |
For a summary, I would like the output to look like below, combining the other gender identity categories (not male or female) into one Other gender identity category, however I am uncertain how to combine the rows and sum them. Is there something I could do to combine the gender identity values to create this output below?
Gender | Total |
---|---|
Female | 15 |
Male | 14 |
Other Gender Identity | 21 |
I have tried the following
df <- aggregate(df, list(Group=replace(rownames(df$gender),rownames(df$gender) %in% c("Agender","Genderfluid","Non-binary"), "Transgender")), sum)
which I found in another output, but this isn't working for me, it is just coming up with an error!
CodePudding user response:
With dplyr
df %>%
mutate(gender = ifelse(!gender %in% c("Male", "Female"), "Other", gender)) %>%
group_by(gender) %>%
summarise(Total = sum(total))
# A tibble: 3 × 2
gender Total
<chr> <dbl>
1 Female 15
2 Male 14
3 Other 21
CodePudding user response:
You could create an aggregation group and aggregate (data.table solution below). In the code below it is good to check if you don't have cell values in the gender column which are uncapitalized (e.g. "male" instead of "Male") or something like that. This would lead them to be added to the "Other Gender Identity" group.
#load library
library(data.table)
#example data
dt <- data.table(gender = c("Agender", "Female", "Genderfluid", "Non Binary", "Male"),
Total = c(9,15,10,2,14))
#create group for aggregation
dt[!gender %in% c("Female", "Male"), group := "Other Gender Identity"]
dt[gender %in% c("Female", "Male"), group := gender]
#aggregate
dt[, sum(Total), by = group]
dt
group V1
1: Other Gender Identity 21
2: Female 15
3: Male 14
CodePudding user response:
Using replace and aggregate:
with(df,
aggregate(x = Total,
by = list(Group = replace(Gender, !Gender %in% c("Male", "Female"), "Other")),
FUN = "sum"))
# Group x
# 1 Female 15
# 2 Male 14
# 3 Other 21