I have this dataframe:
df <- structure(list(A = c(2L, 3L, 4L, 5L, 5L), B = c(3L, 1L, 2L, 5L,
5L), C = c(4L, 5L, 2L, 1L, 1L), D = c(3L, 1L, 5L, 1L, 2L)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -5L))
A B C D
<int> <int> <int> <int>
1 2 3 4 3
2 3 1 5 1
3 4 2 2 5
4 5 5 1 1
5 5 5 1 2
I want to subtract each column from the column that follows it (B - A, C - B, D - C).
I can do this with the following base R code, taken from the question "Subtract a column in a dataframe from many columns in R":
df[-1] - df[-ncol(df)]
B C D
1 1 1 -1
2 -2 4 -4
3 -2 0 3
4 0 -4 0
5 0 -4 1
I would like to use across() because of its .names argument, and thus translate this code to dplyr.
Expected Output:
A B C D B-A C-B D-C
1 2 3 4 3 1 1 -1
2 3 1 5 1 -2 4 -4
3 4 2 2 5 -2 0 3
4 5 5 1 1 0 -4 0
5 5 5 1 2 0 -4 1
My first try:
library(dplyr)
df %>%
mutate(across(everything(), ~df[-1] - df[-ncol(df)], .names = "{.col}-{.col}")) %>%
select(contains("-"))
`A-A`$B $C $D `B-B`$B $C $D `C-C`$B $C $D `D-D`$B $C $D
<int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1 1 1 -1 1 1 -1 1 1 -1 1 1 -1
2 -2 4 -4 -2 4 -4 -2 4 -4 -2 4 -4
3 -2 0 3 -2 0 3 -2 0 3 -2 0 3
4 0 -4 0 0 -4 0 0 -4 0 0 -4 0
5 0 -4 1 0 -4 1 0 -4 1 0 -4 1
My second try:
df %>%
mutate(across(everything(), ~.[-1] - .[-ncol(.)], .names = "{.col}-{.col}"))
Error in `mutate()`:
! Problem while computing
`..1 = across(everything(), ~.[-1]
- .[-ncol(.)], .names =
"{.col}-{.col}")`.
Caused by error in `across()`:
! Problem while computing
column `A-A`.
Caused by error in `-ncol(A)`:
! invalid argument to unary operator
Run `rlang::last_error()` to see where the error occurred.
Answer:
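There are easier ways, but if we want to use across(), we can look up the name of the previous column via cur_column() and subtract that column: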
library(dplyr)
df %>%
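  mutate(across(-1, ~ {
    # name of the column immediately to the left of the current one
    prevnm <- names(cur_data())[match(cur_column(), names(cur_data())) - 1]
    # current column minus the previous column
    .x - df[[prevnm]]
  },
  .names = "{.col}-{names(.)[match(.col, names(.)) - 1]}"))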
-output
# A tibble: 5 × 7
A B C D `B-A` `C-B` `D-C`
<int> <int> <int> <int> <int> <int> <int>
1 2 3 4 3 1 1 -1
2 3 1 5 1 -2 4 -4
3 4 2 2 5 -2 0 3
4 5 5 1 1 0 -4 0
5 5 5 1 2 0 -4 1
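As a side note on why the attempts in the question fail: across() passes each selected column to the function one at a time as a plain vector. In the first try the lambda ignores that vector and returns the full data frame of differences for every column, which is where the nested `A-A`$B, `A-A`$C, ... columns come from. In the second try, ncol() is called on a vector. The following base R sketch reproduces that error without dplyr; the variable names A, n and msg are only for illustration.

```r
# A single column, as across() would hand it to the lambda
A <- c(2L, 3L, 4L, 5L, 5L)

# ncol() returns NULL for a plain vector (it has no dim attribute)
n <- ncol(A)
print(is.null(n))

# Negating NULL is what raises the error reported in the question
msg <- tryCatch(-n, error = function(e) conditionMessage(e))
print(msg)
```

This is why the function given to across() needs some other way to reach the neighbouring column, such as the cur_column() lookup above.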
Or use two across() calls and subtract one result from the other; the names come from the first call:
df %>%
  mutate(across(-1, .names = "{.col}-{names(.)[match(.col, names(.)) - 1]}") -
           across(-last_col()))
# A tibble: 5 × 7
A B C D `B-A` `C-B` `D-C`
<int> <int> <int> <int> <int> <int> <int>
1 2 3 4 3 1 1 -1
2 3 1 5 1 -2 4 -4
3 4 2 2 5 -2 0 3
4 5 5 1 1 0 -4 0
5 5 5 1 2 0 -4 1
Also, there is a more compact option with across2() from dplyover:
library(dplyover) #https://github.com/TimTeaFan/dplyover
df %>%
mutate(across2(-1, -last_col(), ~.x -.y, .names = "{xcol}-{ycol}"))
# A tibble: 5 × 7
A B C D `B-A` `C-B` `D-C`
<int> <int> <int> <int> <int> <int> <int>
1 2 3 4 3 1 1 -1
2 3 1 5 1 -2 4 -4
3 4 2 2 5 -2 0 3
4 5 5 1 1 0 -4 0
5 5 5 1 2 0 -4 1
If the default .names, which uses an underscore as the separator, is acceptable, then it is even simpler:
df %>%
mutate(across2(-1, -last_col(), `-`))
# A tibble: 5 × 7
A B C D B_A C_B D_C
<int> <int> <int> <int> <int> <int> <int>
1 2 3 4 3 1 1 -1
2 3 1 5 1 -2 4 -4
3 4 2 2 5 -2 0 3
4 5 5 1 1 0 -4 0
5 5 5 1 2 0 -4 1
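For comparison, one of the "easier ways" mentioned at the start might look like the base R sketch below. It reuses the subtraction from the question and builds the "B-A" style names with paste(). It assumes a plain data.frame rather than the tibble from the question, and the names diffs and result are only for illustration.

```r
# Same data as in the question, as a plain data.frame (an assumption of this sketch)
df <- data.frame(A = c(2L, 3L, 4L, 5L, 5L), B = c(3L, 1L, 2L, 5L, 5L),
                 C = c(4L, 5L, 2L, 1L, 1L), D = c(3L, 1L, 5L, 1L, 2L))

# Each column minus the column to its left
diffs <- df[-1] - df[-ncol(df)]

# Build names such as "B-A" from the two sets of column names
names(diffs) <- paste(names(df)[-1], names(df)[-ncol(df)], sep = "-")

# cbind() on data frames keeps the non-syntactic names as they are
result <- cbind(df, diffs)
print(result)
```

The trade-off is that this steps outside the dplyr pipeline, so it is mainly useful when the .names mechanism of across() is not actually required.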