Home > OS >  How can I merge two rows together and sum them?
How can I merge two rows together and sum them?

Time:01-12

I've looked through the other problems that have been solved on here, and I can't find anything that quite fits my data and the solution I am looking for yet.

After cleaning my data, I get an output that looks something like this (slimmed down for ease):

Gender Total
Agender 9
Female 15
Genderfluid 10
Non Binary 2
Male 14

For a summary, I would like the output to look like below, combining the other gender identity categories (not male or female) into one Other gender identity category, however I am uncertain how to combine the rows and sum them. Is there something I could do to combine the gender identity values to create this output below?

Gender Total
Female 15
Male 14
Other Gender Identity 21

I have tried the following

df <- aggregate(df, list(Group=replace(rownames(df$gender),rownames(df$gender) %in% c("Agender","Genderfluid","Non-binary"), "Transgender")), sum)

which I found in another output, but this isn't working for me, it is just coming up with an error!

CodePudding user response:

With dplyr

df %>% 
  mutate(gender = ifelse(!gender %in% c("Male", "Female"), "Other", gender)) %>% 
   group_by(gender) %>% 
   summarise(Total = sum(total))

    # A tibble: 3 × 2
      gender Total
      <chr>  <dbl>
    1 Female    15
    2 Male      14
    3 Other     21

CodePudding user response:

You could create an aggregation group and aggregate (data.table solution below). In the code below it is good to check if you don't have cell values in the gender column which are uncapitalized (e.g. "male" instead of "Male") or something like that. This would lead them to be added to the "Other Gender Identity" group.

#load library
library(data.table)

#example data
dt <- data.table(gender = c("Agender", "Female", "Genderfluid", "Non Binary", "Male"),
                 Total = c(9,15,10,2,14))

#create group for aggregation
dt[!gender %in% c("Female", "Male"), group := "Other Gender Identity"]
dt[gender %in% c("Female", "Male"), group := gender]

#aggregate
dt[, sum(Total), by = group]
dt

                   group V1
1: Other Gender Identity 21
2:                Female 15
3:                  Male 14

CodePudding user response:

Using replace and aggregate:

with(df,
     aggregate(x = Total, 
               by = list(Group = replace(Gender, !Gender %in% c("Male", "Female"), "Other")), 
               FUN = "sum"))
#     Group  x
# 1  Female 15
# 2    Male 14
# 3   Other 21
  •  Tags:  
  • r
  • Related