Home > Software engineering >  R: I'm trying to group rows by name and year to create a sum output and avg output
R: I'm trying to group rows by name and year to create a sum output and avg output

Time:07-02

I have a large data frame that includes data on the same people over time I want to group each person by years of 2. The result should sum up one column but average the other because a straight average won't work. I can do this in excel however not in R. I included sample data and desired output.

Name <- c(rep("Joe",3),rep("Kenny",3))
Age <- c(20,21,22,19,20,21)
Year <- c(rep(c(2010,2011,2012),length.out = 6))
Jumps <- c(10,150,200,300,20,90)
Success <- c(0.5,0.6,0.7,0.4,0.5,0.6)
df <- data.frame(Name,Age,Year,Jumps,Success)
   Name Age Year Jumps Success
1   Joe  20 2010    10     0.5
2   Joe  21 2011   150     0.6
3   Joe  22 2012   200     0.7
4 Kenny  19 2010   300     0.4
5 Kenny  20 2011    20     0.5
6 Kenny  21 2012    90     0.6

Desired Output

Desired Output

CodePudding user response:

Update: Complete solution with success column (removed prior solution):

Main feature is to use lead:

library(dplyr)

df %>% 
  group_by(Name) %>% 
  mutate(Success = (Success*Jumps),
         Success = Success   lead(Success),
         Age= paste(Age, lead(Age), sep = "-"),
         Jumps = Jumps   lead(Jumps), 
         Year = paste(Year, lead(Year), sep = "-"),
         Success = Success/Jumps) %>% 
  slice(-n())
 Name  Age   Year      Jumps Success
  <chr> <chr> <chr>     <dbl>   <dbl>
1 Joe   20-21 2010-2011   160   0.594
2 Joe   21-22 2011-2012   350   0.657
3 Kenny 19-20 2010-2011   320   0.406
4 Kenny 20-21 2011-2012   110   0.582
  •  Tags:  
  • r
  • Related