i have a data frame that looks like this;
Date Value1 Value 2 Value 3
1997Q1 100 130 120
1997Q1 100 130 124
1997Q1 120 136 154
1997Q2 180 145 154
1997Q2 186 134 126
1997Q2 186 124 176
1997Q3 190 143 176
1997Q3 192 143 123
I would like to calculate differences for each values within the same date, for example the differences in value 1 column for 1997q1, then 1997q2 and so on. I would like these differences to be shown in a new column, so that the results would look something like this;
Date Value1 Value 2 Value 3 Diff Val 1 Diff Val 2 Diff Val 3
1997Q1 100 130 120 0 0 4
1997Q1 100 130 124 20 6 30
1997Q1 120 136 154 N/A N/A N/A
1997Q2 180 145 154 6 -11 -28
1997Q2 186 134 126 0 10 50
1997Q2 186 124 176 N/A N/A N/A
1997Q3 190 143 176 2 0 -53
1997Q3 192 143 123
CodePudding user response:
You can use dplyr functions for this. The ~ .x - lead(.x)
is the function applied to every value column, selected with starts_with
. we take the current value minus the next value. If you need lag
, switch it around, ~ lag(.x) - .x
library(dplyr)
df1 %>%
group_by(Date) %>%
mutate(across(starts_with("Value"), ~.x - lead(.x), .names = "diff_{.col}"))
if the values are numeric and the column names are not easily found, you can use mutate(across(where(is.numeric), ~.x - lead(.x), .names = "diff_{.col}"))
.
# A tibble: 8 × 7
# Groups: Date [3]
Date Value1 Value2 Value3 diff_Value1 diff_Value2 diff_Value3
<chr> <int> <int> <int> <int> <int> <int>
1 1997Q1 100 130 120 0 0 -4
2 1997Q1 100 130 124 -20 -6 -30
3 1997Q1 120 136 154 NA NA NA
4 1997Q2 180 145 154 -6 11 28
5 1997Q2 186 134 126 0 10 -50
6 1997Q2 186 124 176 NA NA NA
7 1997Q3 190 143 176 -2 0 53
8 1997Q3 192 143 123 NA NA NA
data:
df1 <- structure(list(Date = c("1997Q1", "1997Q1", "1997Q1", "1997Q2",
"1997Q2", "1997Q2", "1997Q3", "1997Q3"), Value1 = c(100L, 100L,
120L, 180L, 186L, 186L, 190L, 192L), Value2 = c(130L, 130L, 136L,
145L, 134L, 124L, 143L, 143L), Value3 = c(120L, 124L, 154L, 154L,
126L, 176L, 176L, 123L)), class = "data.frame", row.names = c(NA,
-8L))