I have a large data frame that includes data on the same people over time I want to group each person by years of 2. The result should sum up one column but average the other because a straight average won't work. I can do this in excel however not in R. I included sample data and desired output.
Name <- c(rep("Joe",3),rep("Kenny",3))
Age <- c(20,21,22,19,20,21)
Year <- c(rep(c(2010,2011,2012),length.out = 6))
Jumps <- c(10,150,200,300,20,90)
Success <- c(0.5,0.6,0.7,0.4,0.5,0.6)
df <- data.frame(Name,Age,Year,Jumps,Success)
Name Age Year Jumps Success
1 Joe 20 2010 10 0.5
2 Joe 21 2011 150 0.6
3 Joe 22 2012 200 0.7
4 Kenny 19 2010 300 0.4
5 Kenny 20 2011 20 0.5
6 Kenny 21 2012 90 0.6
Desired Output
CodePudding user response:
Update: Complete solution with success column (removed prior solution):
Main feature is to use lead
:
library(dplyr)
df %>%
group_by(Name) %>%
mutate(Success = (Success*Jumps),
Success = Success lead(Success),
Age= paste(Age, lead(Age), sep = "-"),
Jumps = Jumps lead(Jumps),
Year = paste(Year, lead(Year), sep = "-"),
Success = Success/Jumps) %>%
slice(-n())
Name Age Year Jumps Success
<chr> <chr> <chr> <dbl> <dbl>
1 Joe 20-21 2010-2011 160 0.594
2 Joe 21-22 2011-2012 350 0.657
3 Kenny 19-20 2010-2011 320 0.406
4 Kenny 20-21 2011-2012 110 0.582