How to get the sum of differences in R dataframe

Time:04-29

I am wondering if anyone knows of any functions or tactics for solving this problem:

Say I have the variable x:

x <- c(1,2,3,4,5,6,7,8,9) 
dataframe <- data.frame(x)
dataframe$y <- 0

I want to calculate the sum of the differences between each value and the rest of the values, so that each value in this variable gets its own difference score. Something along the lines of: y[1] <- sum((x[1]-2) + (x[1]-3) + (x[1]-4) + ... + (x[1]-9))

As a loop, it would be something like this (in reality I have a big data frame with many conditions I want to run this calculation over):

difference_sum <- 0

for (i in x) {
  value_of_interest <- x[i] 
  difference_sum <- difference_sum + (x[i] - x[i + 1])
  difference_sum <- difference_sum + (x[i] - x[i + 2])
} 
#all the way through the end of the list

Thanks!

CodePudding user response:

Not sure if these are the values you're looking for, but perhaps

o <- outer(dataframe$x, dataframe$x, `-`)
o[lower.tri(o)] <- NA
dataframe$y <- rowSums(o, na.rm = TRUE)
dataframe
#   x   y
# 1 1 -36
# 2 2 -28
# 3 3 -21
# 4 4 -15
# 5 5 -10
# 6 6  -6
# 7 7  -3
# 8 8  -1
# 9 9   0
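
Since the real data frame is described as large, the n-by-n matrix built by outer() could get expensive. Below is a minimal sketch of a vectorised alternative, assuming the target is the same "each value minus every later value" result shown above. It relies on the identity that the sum of x[i] - x[j] over all j > i equals (n - i) * x[i] minus the sum of the later values. The names n, later_sum, y_later and y_all are only illustrative.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
n <- length(x)

# Sum of all values that come after position i
later_sum <- sum(x) - cumsum(x)

# (number of later values) * x[i] - (sum of later values)
y_later <- (n - seq_along(x)) * x - later_sum
y_later
# [1] -36 -28 -21 -15 -10  -6  -3  -1   0

# Assumption: if "the rest of the values" means every other value,
# not only the later ones, the same idea reduces to n * x - sum(x)
y_all <- n * x - sum(x)
```

This needs only O(n) memory, so it may scale better than the matrix approach, at the cost of being less obviously tied to the pairwise definition.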

CodePudding user response:

You can use imap_dbl() from purrr, which passes each value as .x and its index as .y.

library(dplyr)
library(purrr)

dataframe %>%
  mutate(y = imap_dbl(x, ~ sum(.x - x[-(1:.y)])))

#   x   y
# 1 1 -36
# 2 2 -28
# 3 3 -21
# 4 4 -15
# 5 5 -10
# 6 6  -6
# 7 7  -3
# 8 8  -1
# 9 9   0

The method above may break if x is a named vector, because imap_dbl() then passes the names rather than the positions as .y. A safer way is to pass the indices 1:n() into map2_dbl():

dataframe %>%
  mutate(y = map2_dbl(x, 1:n(), ~ sum(.x - x[-(1:.y)])))
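
The question mentions running this over many conditions. Here is a rough base-R sketch of one way to do that, assuming a hypothetical grouping column called condition (it is not in the original post). The data frame grouped_df and the helper sum_later_diffs are illustrative names as well. The helper indexes with seq_along(), so names on x are never consulted.

```r
# Hypothetical example data: two conditions with a few values each
grouped_df <- data.frame(
  condition = c("a", "a", "a", "b", "b", "b", "b"),
  x         = c(1, 2, 3, 10, 20, 30, 40)
)

# For each position i, sum x[i] minus every later value.
# Positions come from seq_along(), so a named x behaves the same.
sum_later_diffs <- function(x) {
  n <- length(x)
  vapply(seq_along(x), function(i) {
    later <- x[seq_len(n - i) + i]  # empty for the last element
    sum(x[i] - later)
  }, numeric(1))
}

# ave() applies the helper within each condition and keeps row order
grouped_df$y <- ave(grouped_df$x, grouped_df$condition, FUN = sum_later_diffs)
grouped_df
```

With dplyr, the equivalent would presumably be a group_by() on the grouping column before the mutate() shown above.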
