This is my code
col_vec <- vector(length = 11)
my_df <- mtcars
for (i in 1:11) {
col_vec[i]<- glue::glue("col{i}")
colnames(my_df) <- col_vec
}
I want to do this bellow for all columns but I would like to use glue package:
my_df %>% mutate(up_down = as.numeric(col1) - lag(as.numeric(col1),
default = as.numeric(col1)[1]))
I tried this, but it didnt work:
for (i in 1:11) {
my_df %>% mutate(glue::glue("up_down_{i} = as.numeric(col{i}) - lag(as.numeric(col{i}),
default = as.numeric(col{i})[1]))"))
}
The output should give me for each col_*
a column that calculate the difference between rows.
So, I would have 22 columns as a result
CodePudding user response:
The below answer should work,
Basically, we use mutate()
and across()
to create new columns, that take a lag difference of a column and apply that across to each numeric column. We then use the .names
argument in the across to give each new lag column a new name.
my_df %>%
mutate(across(where(is.numeric),list(lag=~.x-lag(.x)),.names= "{.col}_{.fn}"))
CodePudding user response:
We can use across
more easily without having to assign with glue
on the lhs
. The new columns are automatically added once we modify the .names
library(dplyr)
my_df2 <- my_df %>%
mutate(across(starts_with('col'), ~ .x -
lag(.x, default = first(.x)), .names = "up_down_{.col}"))
-output
> head(my_df2)
col1 col2 col3 col4 col5 col6 col7 col8 col9 col10 col11 up_down_col1 up_down_col2 up_down_col3 up_down_col4 up_down_col5
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 0.0 0 0 0 0.00
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 0.0 0 0 0 0.00
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 1.8 -2 -52 -17 -0.05
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 -1.4 2 150 17 -0.77
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 -2.7 2 102 65 0.07
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 -0.6 -2 -135 -70 -0.39
up_down_col6 up_down_col7 up_down_col8 up_down_col9 up_down_col10 up_down_col11
Mazda RX4 0.000 0.00 0 0 0 0
Mazda RX4 Wag 0.255 0.56 0 0 0 0
Datsun 710 -0.555 1.59 1 0 0 -3
Hornet 4 Drive 0.895 0.83 0 -1 -1 0
Hornet Sportabout 0.225 -2.42 -1 0 0 1
Valiant 0.020 3.20 1 0 0 -1
If we want to use a for
loop, it would need assignment (<-
) to original dataset for creating new columns. In addition, we need to evaluate (!!
) while assigning with :=
) (instead of eval/parse
)
my_df3 <- my_df
for(i in 1:11) {
my_df3 <- my_df3 %>%
mutate(!! glue::glue("up_down_{i}") :=
!! rlang::sym(glue::glue("col{i}")) -
lag(!!rlang::sym(glue::glue("col{i}")), default =
first(!!rlang::sym(glue::glue("col{i}")))))
}
-output
> head(my_df3)
col1 col2 col3 col4 col5 col6 col7 col8 col9 col10 col11 up_down_1 up_down_2 up_down_3 up_down_4 up_down_5 up_down_6 up_down_7
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 0.0 0 0 0 0.00 0.000 0.00
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 0.0 0 0 0 0.00 0.255 0.56
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 1.8 -2 -52 -17 -0.05 -0.555 1.59
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 -1.4 2 150 17 -0.77 0.895 0.83
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 -2.7 2 102 65 0.07 0.225 -2.42
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 -0.6 -2 -135 -70 -0.39 0.020 3.20
up_down_8 up_down_9 up_down_10 up_down_11
Mazda RX4 0 0 0 0
Mazda RX4 Wag 0 0 0 0
Datsun 710 1 0 0 -3
Hornet 4 Drive 0 -1 -1 0
Hornet Sportabout -1 0 0 1
Valiant 1 0 0 -1
> dim(my_df3)
[1] 32 22