Home > Blockchain >  Creating differences in a new column for certain dates in R
Creating differences in a new column for certain dates in R

Time:08-10

i have a data frame that looks like this;

Date    Value1  Value 2 Value 3
1997Q1  100 130     120
1997Q1  100 130     124
1997Q1  120 136     154
1997Q2  180 145     154
1997Q2  186 134     126
1997Q2  186 124     176
1997Q3  190 143     176
1997Q3  192 143     123

I would like to calculate differences for each values within the same date, for example the differences in value 1 column for 1997q1, then 1997q2 and so on. I would like these differences to be shown in a new column, so that the results would look something like this;

Date    Value1  Value 2 Value 3 Diff Val 1  Diff Val 2  Diff Val 3
1997Q1  100     130     120     0          0           4
1997Q1  100     130     124     20         6           30
1997Q1  120     136     154     N/A        N/A         N/A
1997Q2  180     145     154     6          -11         -28
1997Q2  186     134     126     0          10          50
1997Q2  186     124     176     N/A        N/A         N/A
1997Q3  190     143     176     2          0           -53
1997Q3  192     143     123

CodePudding user response:

You can use dplyr functions for this. The ~ .x - lead(.x) is the function applied to every value column, selected with starts_with. we take the current value minus the next value. If you need lag, switch it around, ~ lag(.x) - .x

library(dplyr)

df1 %>% 
  group_by(Date) %>% 
  mutate(across(starts_with("Value"), ~.x - lead(.x), .names = "diff_{.col}"))

if the values are numeric and the column names are not easily found, you can use mutate(across(where(is.numeric), ~.x - lead(.x), .names = "diff_{.col}")).

# A tibble: 8 × 7
# Groups:   Date [3]
  Date   Value1 Value2 Value3 diff_Value1 diff_Value2 diff_Value3
  <chr>   <int>  <int>  <int>       <int>       <int>       <int>
1 1997Q1    100    130    120           0           0          -4
2 1997Q1    100    130    124         -20          -6         -30
3 1997Q1    120    136    154          NA          NA          NA
4 1997Q2    180    145    154          -6          11          28
5 1997Q2    186    134    126           0          10         -50
6 1997Q2    186    124    176          NA          NA          NA
7 1997Q3    190    143    176          -2           0          53
8 1997Q3    192    143    123          NA          NA          NA

data:

df1 <- structure(list(Date = c("1997Q1", "1997Q1", "1997Q1", "1997Q2", 
"1997Q2", "1997Q2", "1997Q3", "1997Q3"), Value1 = c(100L, 100L, 
120L, 180L, 186L, 186L, 190L, 192L), Value2 = c(130L, 130L, 136L, 
145L, 134L, 124L, 143L, 143L), Value3 = c(120L, 124L, 154L, 154L, 
126L, 176L, 176L, 123L)), class = "data.frame", row.names = c(NA, 
-8L))
  • Related