Home > Enterprise >  Sum values based on date
Sum values based on date

Time:10-29

I have a data.frame in R containing data collected over different sites (geographical locations) over several years. I'd like to be able to get a cumulative figure for each year, by combining all the sites together as a 'Total', with the 'Total' falling under the year column in this example.

Here's an example

year<- rep(c(2010, 2011, 2012, 2013), 4)
type<- rep(c('A', 'B', 'C', 'D'),each = 4)
x <- floor(runif(16, 10, 25))
data<-data.frame(year, type,x)

data



year type  x
1  2010    A 22
2  2011    A 22
3  2012    A 11
4  2013    A 13
5  2010    B 23
6  2011    B 22
7  2012    B 12
8  2013    B 10
9  2010    C 22
10 2011    C 17
11 2012    C 14
12 2013    C 10
13 2010    D 23
14 2011    D 22
15 2012    D 19
16 2013    D 23

Here's what I'd like to look like

year type  x
1  2010    A 22
2  2011    A 22
3  2012    A 11
4  2013    A 13
5  Total   A 68
6  2010    B 23
7  2011    B 22
8  2012    B 12
9  2013    B 10
10 Total   B 67
11 2010    C 22
12 2011    C 17
13 2012    C 14
14 2013    C 10
15 Total   C 63
16 2010    D 23
17 2011    D 22
18 2012    D 19
19 2013    D 23
20 Total   D 87

Any help greatly appreciated.

Thanks

CodePudding user response:

you may try

library(dplyr)

data2 <- data %>%
  group_by(type) %>%
  summarise(x = sum(x)) %>%
  mutate(year = "Total")

rbind(data, data2) %>%
  arrange(type, year)

    year type  x
1   2010    A 18
2   2011    A 23
3   2012    A 22
4   2013    A 16
5  Total    A 79
6   2010    B 22
7   2011    B 17
8   2012    B 16
9   2013    B 18
10 Total    B 73
11  2010    C 19
12  2011    C 11
13  2012    C 16
14  2013    C 12
15 Total    C 58
16  2010    D 10
17  2011    D 20
18  2012    D 24
19  2013    D 21
20 Total    D 75

CodePudding user response:

Another solution based on purrr::map_dfr:

library(tidyverse)

df <- data.frame(
  stringsAsFactors = FALSE,
                    year = c(2010L,2011L,2012L,
                             2013L,2010L,2011L,2012L,2013L,2010L,2011L,
                             2012L,2013L,2010L,2011L,2012L,2013L),
                    type = c("A","A","A","A",
                             "B","B","B","B","C","C","C","C","D","D",
                             "D","D"),
                       x = c(22L,22L,11L,13L,
                             23L,22L,12L,10L,22L,17L,14L,10L,23L,22L,
                             19L,23L)
      )

df %>% 
  group_split(type) %>% 
  map_dfr(~ rbind(.x,data.frame(
    year="Total", type=unique(.x$type), x=sum(.x$x))))
#> # A tibble: 20 × 3
#>    year  type      x
#>    <chr> <chr> <int>
#>  1 2010  A        22
#>  2 2011  A        22
#>  3 2012  A        11
#>  4 2013  A        13
#>  5 Total A        68
#>  6 2010  B        23
#>  7 2011  B        22
#>  8 2012  B        12
#>  9 2013  B        10
#> 10 Total B        67
#> 11 2010  C        22
#> 12 2011  C        17
#> 13 2012  C        14
#> 14 2013  C        10
#> 15 Total C        63
#> 16 2010  D        23
#> 17 2011  D        22
#> 18 2012  D        19
#> 19 2013  D        23
#> 20 Total D        87
  •  Tags:  
  • r sum
  • Related