I am trying to calculated the difference between the first and the last value for the each Group for the each columne (Value1, Value2, Value3 etc).
Group Dates Value1 Value2 Value3
1 2000-01-01 NA 0 5
1 2000-02-01 1 0 10
1 2000-03-01 2 1 0
2 2000-04-01 4 1 NA
2 2000-05-01 1 2 NA
2 2000-06-01 2 2 40
For example: diff_Value1=-1 because the first value for the Group 1 is 1 and last value is 2.
I am using below code. How can I extend for 30 more columns (e.g. Value1...-Value 30)? Do I need to use loop inside mutate function?
df <- data.frame(Group=c(rep(1,3), rep(2,3)),
Dates=seq(as.Date("2000/1/1"), by = "month", length.out = 6),
Value1=c(NA, 1:2,4,1:2),
Value2=c(0,0,1,1,2,2),
Value3=c(5,10,0,NA,NA,40)
)
df %>%
group_by(Group) %>%
dplyr::mutate(
Value1_diff = dplyr::first(na.omit(Value1))-dplyr::last(na.omit(Value1)),
Value2_diff = dplyr::first(na.omit(Value2))-dplyr::last(na.omit(Value2)),
Value3_diff = dplyr::first(na.omit(Value3))-dplyr::last(na.omit(Value3))
)
Group Dates Value1 Value2 Value3 Value1_diff Value2_diff Value3_diff
1 2000-01-01 NA 0 5 -1 -1 5
1 2000-02-01 1 0 10 -1 -1 5
1 2000-03-01 2 1 0 -1 -1 5
2 2000-04-01 4 1 NA 2 -1 0
2 2000-05-01 1 2 NA 2 -1 0
2 2000-06-01 2 2 40 2 -1 0
CodePudding user response:
We may use across
to loop over multiple columns
library(dplyr)
df <- df %>%
group_by(Group) %>%
mutate(across(starts_with('Value'),
~ first(na.omit(.)) - last(na.omit(.)), .names = "{.col}_diff")) %>%
ungroup
-output
df
# A tibble: 6 × 8
Group Dates Value1 Value2 Value3 Value1_diff Value2_diff Value3_diff
<dbl> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 2000-01-01 NA 0 5 -1 -1 5
2 1 2000-02-01 1 0 10 -1 -1 5
3 1 2000-03-01 2 1 0 -1 -1 5
4 2 2000-04-01 4 1 NA 2 -1 0
5 2 2000-05-01 1 2 NA 2 -1 0
6 2 2000-06-01 2 2 40 2 -1 0