Home > OS >  Sequential aggregation with a single call of pivot_wider
Sequential aggregation with a single call of pivot_wider

Time:12-10

Consider the dataframe:

df <- data.frame(x = c(1,2,1,1), y = c("a", "a", "b", "a"))

By applying the code below

library(tidyverse)

df %>% 
  pivot_wider(x, names_from = y, values_from = y, values_fn = length, names_prefix = "tot_", values_fill = 0) %>% 
  mutate(per_a = 100*tot_a / rowSums(select(.,starts_with("tot_")))) %>% 
  mutate(per_b = 100*tot_b / rowSums(select(.,starts_with("tot_"))))

one gets the result

   <dbl> <int> <int> <dbl> <dbl>
 1     1     2     1  66.7  33.3
 2     2     1     0 100     0

My question is: Is it possible to get the same result using a single call of pivot_wider, that is, without any mutate?

CodePudding user response:

You'll need to group_by twice I think if you want to do the pivot_wider once without mutate-ing each and every percentage column separately.

df %>%
  group_by(x,y) %>%
  count(name="tot") %>%
  group_by(x) %>%
  mutate(per = tot / sum(tot)) %>%
  pivot_wider(id_cols = x, names_from=y, values_from=c(tot,per))

## A tibble: 2 x 5
## Groups:   x [2]
#      x tot_a tot_b per_a  per_b
#  <dbl> <int> <int> <dbl>  <dbl>
#1     1     2     1 0.667  0.333
#2     2     1    NA 1     NA   

That would be 'tidy' to me, in that you are doing all the calculations in 2 grouped sweeps in the long-form, tidy data, rather than trying to manually select multiple columns in wide format.

  • Related