Home > Software design >  Fill Blank Rows in Dataframe with Sums for each section of data R
Fill Blank Rows in Dataframe with Sums for each section of data R

Time:05-25

I have data that I separated within a dataframe by item description and type. The separations are blank rows but I would like to fill the blank rows in with the sums of numeric values by each description and, if possible, add another blank row below the sums. Preferably, I would not need to sum the sections of data that only contain one row - see variable desc "a" but not a big deal if I do get a sum there.

This is an example of what I have now:

   desc   type  xvalue yvalue
1      a     z     16      1
2
3      b     y     17      2
4      b     y     18      3
5
6      c     x     19      4
7      c     x     20      5
8      c     x     21      6
9
10     d     x    22      7
11     d     x    23      8
12
13     d     y    24      9
14     d     y    25     10

What I am looking for is output that looks similar to this.

  desc   type  xvalue yvalue
1      a     z     16      1
2
3      b     y     17      2
4      b     y     18      3
5                  35      5 
6
7      c     x     19      4
8      c     x     20      5
9      c     x     21      6
10                 40      15
11
12     d     x    22      7
13     d     x    23      8
14                45      15
15
16     d     y    24      9
17     d     y    25     10
18                49     19 

I found an answer on how to do this in a column but not a row. Adding column of summed data by group with empty rows with R

I used acylam's dplyr answer to this question Add blank rows in between existing rows to create the empty rows. I changed the code slightly to fit my data better so my code is:

library(dplyr)
df %>%
  split(df$id, df$group) %>%
  Map(rbind, ., "") %>%
  do.call(rbind, .)

I am hoping I can just add options to the do.call(rbind...) dplyr code I have above.

CodePudding user response:

Depending on how your data is organized we could do it this way:

Assuming empty rows are NA's (if not for example they are blank we can make them NA)

we use group_split() after grouping, getting a list, then iterate with map_df over the list using janitor's adorn_totals


library(dplyr)
library(janitor)
df %>% 
  na.omit() %>% # maybe you don't need this line
  group_by(desc, type) %>% 
  group_split() %>% 
  purrr::map_df(., janitor::adorn_totals) 

 desc type xvalue yvalue
     a    z     16      1
 Total    -     16      1
     b    y     17      2
     b    y     18      3
 Total    -     35      5
     c    x     19      4
     c    x     20      5
     c    x     21      6
 Total    -     60     15
     d    x     22      7
     d    x     23      8
 Total    -     45     15
     d    y     24      9
     d    y     25     10
 Total    -     49     19

data:

structure(list(desc = c("a", NA, "b", "b", NA, "c", "c", "c", 
NA, "d", "d", NA, "d", "d"), type = c("z", NA, "y", "y", NA, 
"x", "x", "x", NA, "x", "x", NA, "y", "y"), xvalue = c(16L, NA, 
17L, 18L, NA, 19L, 20L, 21L, NA, 22L, 23L, NA, 24L, 25L), yvalue = c(1L, 
NA, 2L, 3L, NA, 4L, 5L, 6L, NA, 7L, 8L, NA, 9L, 10L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14"))

CodePudding user response:

Here's a full answer adding empty rows and removing janitor's added stuff from @TarJae's answer:

library(dplyr)
library(janitor)

df <- df %>% 
    na.omit() %>% # maybe you don't need this line
    group_by(desc, type) %>% 
    group_split() %>% 
    purrr::map_df(., \(x) {x <- x %>% janitor::adorn_totals() %>% rbind(NA)}) %>% 
    mutate(
        desc = ifelse(desc == "Total", NA, desc),
        type = ifelse(type == "-", NA, type)
    )
  • Related