Adding Summarized Fields to Data Frame R-CodePudding

I'm trying to add a summarized field to an existing data frame but I'm having trouble doing so. My data looks like:

df<-data.frame(Category=c("A","A","A","A","A","B","B","B","B","B","C","C","C","C"),Description=c("A.a","A.b","A.c","A.c","A.c","B.a","B.a","B.b","B.c","C.a","C.a","C.b","C.a","C.c"),Amount=c(900,1200,700,900,1000,1500,2000,3000,400,500,1000,50,80,90))

With this dataset I need to first summarize the Amount by Description

df2<-df%>%group_by(Description)%>%dplyr::summarise(Amt=sum(Amount))

Next I would like to summarize (sum) by Category and then include that summary into my df2 data frame.

I know how to do this without using grouping, you could use the within function (i think) but I don't understand how I can include a summary into an existing data frame. The results would include the dataset grouped by Category included as a row. The results would like

df3<-data.frame(Description=c("A.a","A.b","A.c","A","B.a","B.b","B.c","B","C.a","C.b","C.c","C"),Amt=c(900,1200,2600,4700,3500,3000,400,7400,1580,50,90,1220))

CodePudding user response：

You may combine the two summary outputs.

library(dplyr)

bind_rows(df %>% 
           group_by(Description)%>% 
            summarise(Amt=sum(Amount)),
          df %>% 
            group_by(Category)%>% 
            summarise(Amt=sum(Amount)) %>% 
            rename(Description = Category)) %>%
  arrange(Description)

#  Description   Amt
#   <chr>       <dbl>
# 1 A            4700
# 2 A.a           900
# 3 A.b          1200
# 4 A.c          2600
# 5 B            7400
# 6 B.a          3500
# 7 B.b          3000
# 8 B.c           400
# 9 C            1220
#10 C.a          1580
#11 C.b            50
#12 C.c            90

CodePudding user response：

Updated answer:

Here are two approaches. The first one uses bind_rows a bit different (and shorter) than @Ronaks answer. We first mutate the df and then bind it to the original data.frame. Then we can use group_by and summarise. For easy cases like this one the first approach work best.

If you have many more rows you'd like to summarise we can use group_modify together with add_row. In your case we only need to specify the needed columns, but we could also use across for more complex cases with many columns.

library(dplyr)

df %>%
  mutate(Description = Category) %>% 
  bind_rows(df) %>% 
  group_by(Description) %>% 
  summarise(Amt = sum(Amount))

#> # A tibble: 12 x 2
#>    Description   Amt
#>    <chr>       <dbl>
#>  1 A            4700
#>  2 A.a           900
#>  3 A.b          1200
#>  4 A.c          2600
#>  5 B            7400
#>  6 B.a          3500
#>  7 B.b          3000
#>  8 B.c           400
#>  9 C            1220
#> 10 C.a          1580
#> 11 C.b            50
#> 12 C.c            90


df %>% 
  group_by(Category, Description) %>% 
  summarise(Amt = sum(Amount)) %>% 
  group_modify(~ .x %>% 
                 add_row(Description = unlist(.y), 
                         Amt = sum(.x$Amt)))

#> `summarise()` has grouped output by 'Category'. You can override using the `.groups` argument.
#> # A tibble: 13 x 3
#> # Groups:   Category [3]
#>    Category Description   Amt
#>    <chr>    <chr>       <dbl>
#>  1 A        A.a           900
#>  2 A        A.b          1200
#>  3 A        A.c          2600
#>  4 A        A            4700
#>  5 B        B.a          3500
#>  6 B        B.b          3000
#>  7 B        B.c           400
#>  8 B        C.a           500
#>  9 B        B            7400
#> 10 C        C.a          1080
#> 11 C        C.b            50
#> 12 C        C.c            90
#> 13 C        C            1220

^{Created on 2021-09-24 by the reprex package (v2.0.1)}