I would like to obtain a cumulative sum through years and for each entity.
Here below an example, and what I have tried so far:
data<-data.frame(id=c("a","a","a","b","a","b","b"),cars=c(1,1,1,1,1,1,1),year=c(2001,2001,2002,2003,2003,2003,2004))
I have tried :
library(tidyverse)
library(stringr)
library(dplyr)
library(tidyr)
data %>% group_by(id,year) %>% mutate(csum = (cumsum(cars))) %>% top_n(1, csum)
id cars year csum
<chr> <dbl> <dbl> <dbl>
1 a 1 2001 2
2 a 1 2002 1
3 a 1 2003 1
4 b 1 2003 2
5 b 1 2004 1
This is what I would like:
id cars year csum
<chr> <dbl> <dbl> <dbl>
1 a 1 2001 2
2 a 1 2002 3
3 a 1 2003 4
4 b 1 2003 2
5 b 1 2004 3
Thank you very much.
CodePudding user response:
dat %>% group_by(id) %>% count(cars,year) %>%
mutate(csum = cumsum(n)) %>% select(-c(cars,n))
# A tibble: 5 × 3
# Groups: id [2]
id year csum
<chr> <dbl> <int>
1 a 2001 2
2 a 2002 3
3 a 2003 4
4 b 2003 2
5 b 2004 3
CodePudding user response:
The function you need is aggregate
, you can use it this way :
aggregate(x=data$cars, #column on which you want to apply a function
by=list(id=data$id,year=data$year),#grouping variables
FUN=sum)#function to apply
CodePudding user response:
Does this work?
data %>% group_by(id, year) %>% summarise(sum_cars = sum(cars)) %>%
group_by(id) %>% mutate(csum_cars = cumsum(sum_cars), .keep = 'unused')
Output:
# Groups: id [2]
id year csum_cars
<fct> <dbl> <dbl>
1 a 2001 2
2 a 2002 3
3 a 2003 4
4 b 2003 2
5 b 2004 3