Home > OS >  In R , how to summarize data frame in multiple dimensions
In R , how to summarize data frame in multiple dimensions

Time:06-19

There is dataframe raw_data as below, How can i change it to wished_data in easy way ?

I currently know group_by/summarise the data serval times (and add variables) , then rbind them. But this is little boring , especially when variables more then this example in occasion.

I want to know ,if is there any general method for similar situation ? Thanks!

library(tidyverse)
country <- c('UK','US','UK','US')
category <- c("A", "B", "A", "B")
y2021 <- c(17, 42, 21, 12)
y2022 <- c(49, 23, 52, 90)

raw_data <- data.frame(country,category,y2021,y2022)

enter image description here

CodePudding user response:

We may use rollup/cube/groupingsets from data.table

library(data.table)
out <- rbind(setDT(raw_data), groupingsets(raw_data, j = lapply(.SD, sum), 
  by = c("country", "category"), 
    sets = list("country", "category", character())))
out[is.na(out)] <- 'TOTAL'

-output

> out
   country category y2021 y2022
    <char>   <char> <num> <num>
1:      UK        A    17    49
2:      US        B    42    23
3:      UK        A    21    52
4:      US        B    12    90
5:      UK    TOTAL    38   101
6:      US    TOTAL    54   113
7:   TOTAL        A    38   101
8:   TOTAL        B    54   113
9:   TOTAL    TOTAL    92   214

Or with cube

out <- rbind(raw_data, cube(raw_data, 
  j = .(y2021= sum(y2021), y2022=sum(y2022)), by = c("country", "category")))
out[is.na(out)] <- 'TOTAL'

CodePudding user response:

We can use the adorn_totals function from janitor like this:

library(dplyr)
library(janitor)

get_totals <- function(data, col) {
  data %>%
    group_by({{col}}) %>%
    group_modify(~ adorn_totals(.)) %>%
    ungroup %>%
    filter(rowSums(across() == "Total") > 0) %>%
    select(any_of(names(data)))
}

bind_rows(
  raw_data,
  get_totals(raw_data, category), 
  get_totals(raw_data, country), 
  get_totals(raw_data)
)

giving:

  country category y2021 y2022
1      UK        A    17    49
2      US        B    42    23
3      UK        A    21    52
4      US        B    12    90
5   Total        A    38   101
6   Total        B    54   113
7      UK    Total    38   101
8      US    Total    54   113
9   Total        -    92   214
  • Related