R: Subtract rows from the first row in each group

Time:06-08

I would like to subtract each row from the first row in each Group for the column Distance.

My data is the following:

Data <- structure(list(Group = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L), Distance = c(0.05, 0.084, 0.06, 0.03, 
0.034, 0.0534, 0.034, 0.23, 0.34, 0.6435, 0.6346, 0.234, 0.246, 
0.4, 0.7)), class = "data.frame", row.names = c(NA, -15L))

I have tried the following code:

library(dplyr)

Data <- Data %>%
  group_by(Group) %>%
  mutate(Difference = Distance - dplyr::lag(Distance))

However, this gives the difference between consecutive rows. How can I adjust the code so that, within each Group, every value in column Distance is subtracted from the value in the first row of that group? For group 1, that would mean the values 0.084, 0.06, 0.03 and 0.034 are each subtracted from 0.05.

CodePudding user response:

Update (see comments), with many thanks to @Gregor Thomas:

library(dplyr)
Data %>% 
  group_by(Group) %>% 
  mutate(Distance = Distance - Distance[1])

OK, now I see it too. This is the same solution Gregor Thomas provided in the comments, just written the dplyr way.

   Group Distance
   <int>    <dbl>
 1     1   0     
 2     1   0.034 
 3     1   0.0100
 4     1  -0.02  
 5     1  -0.016 
 6     2   0     
 7     2  -0.0194
 8     2   0.177 
 9     2   0.287 
10     2   0.590 
11     3   0     
12     3  -0.401 
13     3  -0.389 
14     3  -0.235 
15     3   0.0654
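
For readers who prefer to avoid dplyr, the same per-group subtraction can probably be sketched in base R with ave(). This is only an illustration of an equivalent approach; the column name Difference is chosen here and does not come from the original answer.

```r
# Rebuild the data frame from the question.
Data <- structure(list(Group = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L), Distance = c(0.05, 0.084, 0.06, 0.03,
0.034, 0.0534, 0.034, 0.23, 0.34, 0.6435, 0.6346, 0.234, 0.246,
0.4, 0.7)), class = "data.frame", row.names = c(NA, -15L))

# ave() applies FUN to the Distance values of each Group and returns
# the results in the original row order.
# x[1] is the first Distance value within the current group.
Data$Difference <- ave(Data$Distance, Data$Group,
                       FUN = function(x) x - x[1])

print(Data)
```

Writing the result to a new column keeps the original Distance values, which makes it easier to check the differences by eye.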

First answer: Here is another dplyr option:

library(dplyr)
Data %>% 
  group_by(Group) %>% 
  mutate(Distance = Distance - Distance[row_number() == 1])

   Group Distance
   <int>    <dbl>
 1     1   0     
 2     1   0.034 
 3     1   0.0100
 4     1  -0.02  
 5     1  -0.016 
 6     2   0     
 7     2  -0.0194
 8     2   0.177 
 9     2   0.287 
10     2   0.590 
11     3   0     
12     3  -0.401 
13     3  -0.389 
14     3  -0.235 
15     3   0.0654

CodePudding user response:

A possible solution:

library(dplyr)

Data %>% 
  group_by(Group) %>% 
  mutate(Distance2 = first(Distance) - Distance) %>% 
  ungroup()

#> # A tibble: 15 x 3
#>    Group Distance Distance2
#>    <int>    <dbl>     <dbl>
#>  1     1   0.05      0     
#>  2     1   0.084    -0.034 
#>  3     1   0.06     -0.0100
#>  4     1   0.03      0.02  
#>  5     1   0.034     0.016 
#>  6     2   0.0534    0     
#>  7     2   0.034     0.0194
#>  8     2   0.23     -0.177 
#>  9     2   0.34     -0.287 
#> 10     2   0.644    -0.590 
#> 11     3   0.635     0     
#> 12     3   0.234     0.401 
#> 13     3   0.246     0.389 
#> 14     3   0.4       0.235 
#> 15     3   0.7      -0.0654
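
The answers on this page differ in the direction of the subtraction: one computes Distance minus the first value, the other the first value minus Distance. The sketch below puts both side by side so the sign difference is visible. The column names ToFirst and FromFirst are hypothetical and chosen only for this illustration; the question's wording ("subtracted from 0.05") appears to correspond to FromFirst.

```r
library(dplyr)

# Rebuild the data frame from the question.
Data <- structure(list(Group = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L), Distance = c(0.05, 0.084, 0.06, 0.03,
0.034, 0.0534, 0.034, 0.23, 0.34, 0.6435, 0.6346, 0.234, 0.246,
0.4, 0.7)), class = "data.frame", row.names = c(NA, -15L))

result <- Data %>%
  group_by(Group) %>%
  mutate(
    ToFirst   = Distance - first(Distance),  # each row minus the group's first row
    FromFirst = first(Distance) - Distance   # the group's first row minus each row
  ) %>%
  ungroup()

print(result)
```

The two columns are negatives of each other, so choosing between them only depends on which sign convention is wanted.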