Home > Software engineering >  How to calculate difference between two groups considering other columns groups
How to calculate difference between two groups considering other columns groups

Time:11-25

I'm struggling on how can I calculate the difference between the measure of the car considering the date1 and the side, i.e., I'd like to obtain the difference between the side A and side B for the same car at the same date1.

Here is a toy example (in my real problem I have several cars and dates...)

test = data.frame(date1= c("20-09-2020", "25-10-2020", "26-10-2020", "27-10-2020", "30-10-2020", "20-09-2020", "25-10-2020", "26-10-2020", "27-10-2020", "30-10-2020"),
                  side = c("A", "A", "A", "A", "A", "B", "B", "B",  "B", "B"), 
                  car = c("46-1", "46-2", "47-1", "46-3", "46-4", "46-1", "46-2", "47-1", "46-3", "46-4"),
                  measure = c(55,34,45,64,13, 52,33,42,60,11))
# test
#         date1 side  car measure
# 1  20-09-2020    A 46-1      55
# 2  25-10-2020    A 46-2      34
# 3  26-10-2020    A 47-1      45
# 4  27-10-2020    A 46-3      64
# 5  30-10-2020    A 46-4      13
# 6  20-09-2020    B 46-1      52
# 7  25-10-2020    B 46-2      33
# 8  26-10-2020    B 47-1      42
# 9  27-10-2020    B 46-3      60
# 10 30-10-2020    B 46-4      11

#I'd like something like
reult
# test
#         date1    car      abs.difference.side
# 1  20-09-2020    46-1      3
# 2  25-10-2020    46-2      1
# 3  26-10-2020    47-1      3
# 4  27-10-2020    46-3      4
# 5  30-10-2020    46-4      2

Maybe I can use something like:

  group_by(date1, car) %>%
  mutate(diference = abs( measure.side.A - measure.side.B))

Any hint on how can I do that?

CodePudding user response:

You can do (assuming you only have side A and side B in your data):

library(tidyverse)

test |> 
  group_by(date1, car) |> 
  summarize(abs.diff = max(measure) - min(measure))

which gives:

# A tibble: 5 x 3
# Groups:   date1 [5]
  date1      car   abs.diff
  <chr>      <chr>    <dbl>
1 20-09-2020 46-1         3
2 25-10-2020 46-2         1
3 26-10-2020 47-1         3
4 27-10-2020 46-3         4
5 30-10-2020 46-4         2

CodePudding user response:

We may use diff with range

library(dplyr)
test %>% 
   group_by(date1, car) %>%
   summarise(abs.diff = diff(range(measure)), .groups = 'drop')

-output

# A tibble: 5 × 3
  date1      car   abs.diff
  <chr>      <chr>    <dbl>
1 20-09-2020 46-1         3
2 25-10-2020 46-2         1
3 26-10-2020 47-1         3
4 27-10-2020 46-3         4
5 30-10-2020 46-4         2
  • Related