Despite using R and dplyr on a regular basis, I encountered the issue of not being able to calculate the sum of the absolute differences between all columns:
sum_diff=ABS(A-B) ABS(B-C) ABS(C-D)...
A | B | C | D | sum_diff |
---|---|---|---|---|
1 | 2 | 3 | 4 | 3 |
2 | 1 | 3 | 4 | 4 |
1 | 2 | 1 | 1 | 2 |
4 | 1 | 2 | 1 | 5 |
I know I could iterate using a for loop over all columns, but given the size of my data frame, I prefer a more elegant and fast solution.
Any help?
Thank you
CodePudding user response:
We may remove the first and last columns, get the difference, and use rowSums
on the abs
olute values in base R
. This could be very efficient compared to a package solution
df1$sum_diff <- rowSums(abs(df1[-ncol(df1)] - df1[-1]))
-output
> df1
A B C D sum_diff
1 1 2 3 4 3
2 2 1 3 4 4
3 1 2 1 1 2
4 4 1 2 1 5
Or another option is rowDiffs
from matrixStats
library(matrixStats)
rowSums(abs(rowDiffs(as.matrix(df1))))
[1] 3 4 2 5
data
df1 <- structure(list(A = c(1L, 2L, 1L, 4L), B = c(2L, 1L, 2L, 1L),
C = c(3L, 3L, 1L, 2L), D = c(4L, 4L, 1L, 1L)), row.names = c(NA,
-4L), class = "data.frame")
CodePudding user response:
Daata from akrun (many thanks)!
This is complicated the idea is to generate a list of the combinations, I tried it with combn
but then I get all possible combinations. So I created by hand.
With this combinations we then could use purrr
s map_dfc
and do some data wrangling after that:
library(tidyverse)
combinations <-list(c("A", "B"), c("B", "C"), c("C","D"))
purrr::map_dfc(combinations, ~{df <- tibble(a=data[[.[[1]]]]-data[[.[[2]]]])
names(df) <- paste0(.[[1]],"_v_",.[[2]])
df}) %>%
transmute(sum_diff = rowSums(abs(.))) %>%
bind_cols(data)
sum_diff A B C D
<dbl> <int> <int> <int> <int>
1 3 1 2 3 4
2 4 2 1 3 4
3 2 1 2 1 1
4 5 4 1 2 1
data:
data <- structure(list(A = c(1L, 2L, 1L, 4L), B = c(2L, 1L, 2L, 1L),
C = c(3L, 3L, 1L, 2L), D = c(4L, 4L, 1L, 1L)), row.names = c(NA,
-4L), class = "data.frame")