There is dataframe raw_data
as below, How can i change it to wished_data
in easy way ?
I currently know group_by/summarise
the data serval times (and add variables) , then rbind
them. But this is little boring , especially when variables more then this example in occasion.
I want to know ,if is there any general method for similar situation ? Thanks!
library(tidyverse)
country <- c('UK','US','UK','US')
category <- c("A", "B", "A", "B")
y2021 <- c(17, 42, 21, 12)
y2022 <- c(49, 23, 52, 90)
raw_data <- data.frame(country,category,y2021,y2022)
CodePudding user response:
We may use rollup/cube/groupingsets
from data.table
library(data.table)
out <- rbind(setDT(raw_data), groupingsets(raw_data, j = lapply(.SD, sum),
by = c("country", "category"),
sets = list("country", "category", character())))
out[is.na(out)] <- 'TOTAL'
-output
> out
country category y2021 y2022
<char> <char> <num> <num>
1: UK A 17 49
2: US B 42 23
3: UK A 21 52
4: US B 12 90
5: UK TOTAL 38 101
6: US TOTAL 54 113
7: TOTAL A 38 101
8: TOTAL B 54 113
9: TOTAL TOTAL 92 214
Or with cube
out <- rbind(raw_data, cube(raw_data,
j = .(y2021= sum(y2021), y2022=sum(y2022)), by = c("country", "category")))
out[is.na(out)] <- 'TOTAL'
CodePudding user response:
We can use the adorn_totals function from janitor like this:
library(dplyr)
library(janitor)
get_totals <- function(data, col) {
data %>%
group_by({{col}}) %>%
group_modify(~ adorn_totals(.)) %>%
ungroup %>%
filter(rowSums(across() == "Total") > 0) %>%
select(any_of(names(data)))
}
bind_rows(
raw_data,
get_totals(raw_data, category),
get_totals(raw_data, country),
get_totals(raw_data)
)
giving:
country category y2021 y2022
1 UK A 17 49
2 US B 42 23
3 UK A 21 52
4 US B 12 90
5 Total A 38 101
6 Total B 54 113
7 UK Total 38 101
8 US Total 54 113
9 Total - 92 214