Home > Software design >  Mutate multiple columns
Mutate multiple columns

Time:06-17

I am trying to calculated the difference between the first and the last value for the each Group for the each columne (Value1, Value2, Value3 etc).

Group    Dates      Value1 Value2 Value3
1      2000-01-01     NA      0      5
1      2000-02-01      1      0     10
1      2000-03-01      2      1      0
2      2000-04-01      4      1     NA
2      2000-05-01      1      2     NA
2      2000-06-01      2      2     40

For example: diff_Value1=-1 because the first value for the Group 1 is 1 and last value is 2.

I am using below code. How can I extend for 30 more columns (e.g. Value1...-Value 30)? Do I need to use loop inside mutate function?

df <- data.frame(Group=c(rep(1,3), rep(2,3)),
                    Dates=seq(as.Date("2000/1/1"), by = "month", length.out = 6),
                    Value1=c(NA, 1:2,4,1:2),
                    Value2=c(0,0,1,1,2,2),
                    Value3=c(5,10,0,NA,NA,40)
)

df %>% 
  group_by(Group) %>%
  dplyr::mutate(
    Value1_diff = dplyr::first(na.omit(Value1))-dplyr::last(na.omit(Value1)),
    Value2_diff = dplyr::first(na.omit(Value2))-dplyr::last(na.omit(Value2)),
    Value3_diff = dplyr::first(na.omit(Value3))-dplyr::last(na.omit(Value3))
  )
 Group    Dates      Value1 Value2 Value3 Value1_diff Value2_diff Value3_diff
 1     2000-01-01     NA      0      5          -1          -1           5
 1     2000-02-01      1      0     10          -1          -1           5
 1     2000-03-01      2      1      0          -1          -1           5
 2     2000-04-01      4      1     NA           2          -1           0
 2     2000-05-01      1      2     NA           2          -1           0
 2     2000-06-01      2      2     40           2          -1           0

CodePudding user response:

We may use across to loop over multiple columns

library(dplyr)
df <- df %>%
   group_by(Group) %>%
    mutate(across(starts_with('Value'),  
    ~ first(na.omit(.)) - last(na.omit(.)), .names = "{.col}_diff")) %>% 
    ungroup

-output

df
# A tibble: 6 × 8
  Group Dates      Value1 Value2 Value3 Value1_diff Value2_diff Value3_diff
  <dbl> <date>      <dbl>  <dbl>  <dbl>       <dbl>       <dbl>       <dbl>
1     1 2000-01-01     NA      0      5          -1          -1           5
2     1 2000-02-01      1      0     10          -1          -1           5
3     1 2000-03-01      2      1      0          -1          -1           5
4     2 2000-04-01      4      1     NA           2          -1           0
5     2 2000-05-01      1      2     NA           2          -1           0
6     2 2000-06-01      2      2     40           2          -1           0
  •  Tags:  
  • r
  • Related