How to ungroup and mutate results to the original dataset after summing up values in R

Time:06-24

I want to combine two compound commands from the "dplyr" package for simplicity.

This is a hypothetical dataset:

V5 V15 sum length density
upstream g1 1234 17645 0.1
upstream g2 3456 17645 0.3
downstream g1 2345 17645 0.2
downstream g2 1456 17645 0.1

I first get the total length of each region:

df %>% dplyr::group_by(V5) %>% 
  dplyr::summarize(sum(sum)) %>% 
  ungroup()

Then I manually add the totals to a new column and compute each region's percentage:

# str_detect() comes from the stringr package
df <- df %>% mutate(region = case_when(
    str_detect(V5, "upstream") ~ "4690",
    str_detect(V5, "downstream") ~ "3801"
))

df$Gsize <- (as.numeric(df$region)/14675549)*100

The function ungroup() doesn't do what I expected: I want the summed value to be added to every row of its region. How can I combine the first and second steps so that each region's size is calculated automatically and added to a new column, from which I can then get the percentage? Doing this manually is tedious for many regions and many tables.

expected result:

V5 V15 sum length density region
upstream g1 1234 17645 0.1 4690
upstream g2 3456 17645 0.3 4690
downstream g1 2345 17645 0.2 3801
downstream g2 1456 17645 0.1 3801

CodePudding user response:

After computing the totals, join the totals with the original dataset. Then you can proceed with your percentage calculation.

library(dplyr)

df %>%
  group_by(V5) %>% 
  summarize(total = sum(sum)) %>% 
  left_join(df, by = "V5")
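As an alternative to the join, here is a minimal sketch that uses mutate() on the grouped data instead of summarize(). summarize() collapses each group to a single row, and ungroup() only removes the grouping, so it cannot bring the original rows back. A grouped mutate() keeps every row and repeats the group total on each one. The data frame below is a hypothetical reconstruction of the example table in the question, and the names region and Gsize and the constant 14675549 are taken from the question's own code.

```r
library(dplyr)

# Hypothetical data frame mirroring the example table in the question
df <- data.frame(
  V5      = c("upstream", "upstream", "downstream", "downstream"),
  V15     = c("g1", "g2", "g1", "g2"),
  sum     = c(1234, 3456, 2345, 1456),
  length  = 17645,
  density = c(0.1, 0.3, 0.2, 0.1)
)

result <- df %>%
  group_by(V5) %>%
  # grouped mutate: the per-region total is repeated on every row of that region
  mutate(region = sum(sum)) %>%
  ungroup() %>%
  # percentage calculation from the question, applied to the numeric total
  mutate(Gsize = region / 14675549 * 100)

print(result)
```

Because region is computed as a number here, the as.numeric() conversion from the question should no longer be needed.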
