Home > Software engineering >  Summarise multiple but not all column
Summarise multiple but not all column

Time:09-20

I have a dataset with 51 columns and I want to add summary rows for most of these variables. Currently columns 5:48 are various metrics with each row being 1 area from 1 quarter, I am summing the metric for all quarters for each area and ultimately creating a rate. The below code works fine for doing this to one individual metric but I need to run this for 44 different columns.

example <- test %>%
  group_by(Area) %>%
  summarise(`Metric 1`= (sum(`Metric 1`))/(mean(Population))*10000) %>%
  bind_rows(test) %>%
  arrange(Area,-Quarter)%>% # sort so that total rows come after the quarters
  mutate(Timeframe=if_else(is.na(Quarter),'12 month rolling', 'Quarterly')) 

I have tried creating a for loop and using the column index values, however, that hasn't worked and just returns various errors. I've been unable to get the above script working with index values as well, the below gives an error ('Error: unexpected '=' in: " group_by_at(Local_Authority) %>% summarise(u17_12mo[5]=")

example <- test %>%
  group_by_at(Area) %>%
  summarise(test[5]= (sum(test[5]))/(mean(test[4]))*10000) %>%
  bind_rows(test) %>%
  arrange(Area,-Quarter)%>% # sort so that total rows come after the quarters
  mutate(Timeframe=if_else(is.na(Quarter),'12 month rolling', 'Quarterly')) 

Any help on setting up a for loop for this, or another way entirely would be great

CodePudding user response:

Without data, its tough to help, but maybe this would work for you:

library(tidyverse)

example <- test %>%
  group_by(Area) %>%
  summarise(across(5:48, ~(sum(.))/(mean(Population))*10000))

  •  Tags:  
  • r
  • Related