Home > front end >  Creating a row that aggregates the data for that day
Creating a row that aggregates the data for that day

Time:07-06

I've got a dataframe of the following pattern:

Date     Item            Purchased
01/01/08 Fruit              48
01/01/08 Confectionary      42
01/01/08 Appliance          11
01/06/08 Confectionary      16
01/06/08 Fruit              19
01/06/08 Appliance          50

What I'd like to do is sum by date the total number of items sold on that day, as a row under the 'Item' column, maybe with the name 'Overall' or something. So 01/01/08 will have 4 rows with that date, one titled 'Overall' with '101' in the 'Purchased' column. Thanks for any assistance/advice.

CodePudding user response:

We can use adorn_totals from janitor after grouping by 'Date' which add a new row with the sum from the numeric column

library(dplyr)
library(janitor)
df1 %>%
    group_by(Date) %>%
    group_modify(~ adorn_totals(.x, name = "Overall")) %>%
    ungroup

-output

# A tibble: 8 × 3
  Date     Item          Purchased
  <chr>    <chr>             <int>
1 01/01/08 Fruit                48
2 01/01/08 Confectionary        42
3 01/01/08 Appliance            11
4 01/01/08 Overall             101
5 01/06/08 Confectionary        16
6 01/06/08 Fruit                19
7 01/06/08 Appliance            50
8 01/06/08 Overall              85

data

df1 <- structure(list(Date = c("01/01/08", "01/01/08", "01/01/08", "01/06/08", 
"01/06/08", "01/06/08"), Item = c("Fruit", "Confectionary", "Appliance", 
"Confectionary", "Fruit", "Appliance"), Purchased = c(48L, 42L, 
11L, 16L, 19L, 50L)), class = "data.frame", row.names = c(NA, 
-6L))

CodePudding user response:

Another option where you group_by and use bind_rows and summarise to compute the rows and add them to the dataframe. Should only arrange to have them at the bottom per group like this:

df <- read.table(text="Date     Item            Purchased
01/01/08 Fruit              48
01/01/08 Confectionary      42
01/01/08 Appliance          11
01/06/08 Confectionary      16
01/06/08 Fruit              19
01/06/08 Appliance          50", header=TRUE)

library(dplyr)
df %>%
  group_by(Date) %>%
  bind_rows(summarise(., across(where(is.numeric), sum),
                      across(where(is.character), ~"Overall"))) %>%
  arrange(Date) %>%
  ungroup()
#> # A tibble: 8 × 3
#>   Date     Item          Purchased
#>   <chr>    <chr>             <int>
#> 1 01/01/08 Fruit                48
#> 2 01/01/08 Confectionary        42
#> 3 01/01/08 Appliance            11
#> 4 01/01/08 Overall             101
#> 5 01/06/08 Confectionary        16
#> 6 01/06/08 Fruit                19
#> 7 01/06/08 Appliance            50
#> 8 01/06/08 Overall              85

Created on 2022-07-05 by the reprex package (v2.0.1)

CodePudding user response:

Here is one more tidyverse using add_row:

library(tidyverse)

df1 %>% 
  split(~Date) %>% 
  map_dfr(~ .x %>% 
            add_row(Date = .$Date[1], 
                    Item = "Overall",
                    Purchased = sum(.$Purchased)
                    ))
    Date          Item Purchased
1 01/01/08         Fruit        48
2 01/01/08 Confectionary        42
3 01/01/08     Appliance        11
4 01/01/08       Overall       101
5 01/06/08 Confectionary        16
6 01/06/08         Fruit        19
7 01/06/08     Appliance        50
8 01/06/08       Overall        85

CodePudding user response:

For a base R solution, the aggregate function does exactly this.

aggregate(Purchased~Date, data=df1, FUN=sum)

To integrate it into your existing table, you'll have to add the Item column and rbind, and optionally order by Date.

df1 <- rbind(df1, data.frame(Item="Overall", aggregate(Purchased~Date, data=df1, FUN=sum)))
  • Related