How to ungroup and add the results back to the original dataset after summing up values in R

I want to combine two multi-step commands from the "dplyr" package for simplicity.

This is a hypothetical dataset:

V5	V15	sum	length	density
upstream	g1	1234	17645	0.1
upstream	g2	3456	17645	0.3
downstream	g1	2345	17645	0.2
downstream	g2	1456	17645	0.1

I first get the total length of each region:

df %>% dplyr::group_by(V5) %>% 
  dplyr::summarize(sum(sum)) %>% 
  ungroup()

Then I manually add the totals as a new column and calculate the percentage:

df <- df %>% mutate(region = case_when(
    str_detect(V5, "upstream") ~ "4690",
    str_detect(V5, "downstream") ~ "3801"
))

df$Gsize <- (as.numeric(df$region)/14675549)*100

The function ungroup() doesn't do what I expected: I want the summed value to be added to every row of its region. How can I combine the first and second steps so that each region's size is calculated automatically and added as a new column, from which I can then get the percentage? Doing this manually is tedious for many regions and many tables.

Expected result:

V5	V15	sum	length	density	region
upstream	g1	1234	17645	0.1	4690
upstream	g2	3456	17645	0.3	4690
downstream	g1	2345	17645	0.2	3801
downstream	g2	1456	17645	0.1	3801

CodePudding user response：

After computing the totals, join the totals with the original dataset. Then you can proceed with your percentage calculation.

library(dplyr)

df %>%
  group_by(V5) %>%
  summarize(total = sum(sum)) %>%  # one row per region with its total
  left_join(df, by = "V5")         # attach each total to the matching original rows
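As a possible alternative to the join, a grouped mutate() may get the same result in one pipeline. summarize() collapses each group to a single row, which is why ungroup() afterwards cannot bring the original rows back, whereas mutate() keeps every row and repeats the group total on each of them. The sketch below is only an illustration under some assumptions: the data frame is rebuilt by hand from the hypothetical table in the question, the column names region and Gsize and the constant 14675549 are taken from the question, and the name result is made up here.

```r
library(dplyr)

# Hypothetical data, copied from the table in the question
df <- data.frame(
  V5      = c("upstream", "upstream", "downstream", "downstream"),
  V15     = c("g1", "g2", "g1", "g2"),
  sum     = c(1234, 3456, 2345, 1456),
  length  = 17645,
  density = c(0.1, 0.3, 0.2, 0.1)
)

result <- df %>%
  group_by(V5) %>%
  mutate(region = sum(sum)) %>%            # per-region total, repeated on every row
  ungroup() %>%
  mutate(Gsize = region / 14675549 * 100)  # percentage, using the constant from the question

print(result)
```

Here region stays numeric, so the as.numeric() conversion from the manual version should not be needed, and the original row and column order is preserved with the new columns appended at the end.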