Writing a summation formula using variables from multiple observations

Time:08-19

I am trying to create a new variable for each observation using the following formula:

Index = ∑(BAj / DISTANCEij)

where: j = focal observation; i = other observation

Basically, I'm taking the focal individual (i), finding the Euclidean distance between it and another point, and dividing the other point's BA by that distance. I do that for all the other points, sum the results, and then repeat all of this for each point.

Here is some sample data:

ID <- 1:4
BA <- c(3, 5, 6, 9)
x <- c(0, 2, 3, 7)
y <- c(1, 3, 4, 9)
df <- data.frame(ID, BA, x, y)
print(df)

  ID BA x y
1  1  3 0 1
2  2  5 2 3
3  3  6 3 4
4  4  9 7 9

Currently, I've extracted two rows as vectors and written a function that calculates one term of the formula, shown here:

vec1 <- df[1, ]
vec2 <- df[2, ]

dist <- function(vec1, vec2) vec1$BA/sqrt((vec2$x - vec1$x)^2 + 
                                                (vec2$y - vec1$y)^2)

My question is how do I repeat this with the x and y values for vec2 changing for each new other point with vec1 remaining the same and then sum them all together?

CodePudding user response:

We may loop over the row sequence, extract the focal row and the remaining rows, and apply the dist function from the question:

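library(dplyr)
library(purrr)
# `dist` below is the function defined in the question, not stats::dist()
df %>% 
  mutate(dist_out = map_dbl(row_number(), ~ {
    othr <- cur_data()[-.x, ]  # every row except the focal one
    cur <- cur_data()[.x, ]    # the focal row
    sum(dist(cur, othr))
  }))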

-output

  ID BA x y dist_out
1  1  3 0 1 2.049983
2  2  5 2 3 5.943485
3  3  6 3 4 6.593897
4  4  9 7 9 3.404545
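
As a sanity check, the first value in the output can be reproduced by hand. This is only a minimal sketch using the sample data from the question; the names d1 and index1 are made up for illustration, and it assumes (as the code above does) that the focal point's BA is the numerator.

```r
# Sample data from the question
BA <- c(3, 5, 6, 9)
x <- c(0, 2, 3, 7)
y <- c(1, 3, 4, 9)

# Euclidean distances from point 1 to points 2, 3 and 4:
# sqrt(8), sqrt(18) and sqrt(113)
d1 <- sqrt((x[-1] - x[1])^2 + (y[-1] - y[1])^2)

# Focal BA divided by each distance, then summed
index1 <- sum(BA[1] / d1)
index1
#> [1] 2.049983
```

The three terms are roughly 1.0607, 0.7071 and 0.2822, which add up to the 2.049983 shown for ID 1.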

CodePudding user response:

Here are two base R ways.

1. for loop

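ID <- 1:4
BA <- c(3, 5, 6, 9)
x <- c(0, 2, 3, 7)
y <- c(1, 3, 4, 9)
df <- data.frame(ID, BA, x, y)

n <- nrow(df)
# stats::dist() is called explicitly so the question's own `dist` function cannot mask it
d <- stats::dist(df[c("x", "y")], upper = TRUE)
d <- as.matrix(d)  # full n x n matrix of pairwise Euclidean distances
Index <- numeric(n)
for(j in seq_len(n)) {
  d_j <- d[-j, j, drop = TRUE]  # distances from point j to every other point
  Index[j] <- sum(df$BA[j]/d_j)
}
Index
#> [1] 2.049983 5.943485 6.593897 3.404545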

Created on 2022-08-18 by the reprex package (v2.0.1)

2. sapply loop

# Reuses n and d from the for loop example above
Index <- sapply(seq_len(n), \(j) sum(df$BA[j]/d[-j, j, drop = TRUE]))
Index
#> [1] 2.049983 5.943485 6.593897 3.404545

Created on 2022-08-18 by the reprex package (v2.0.1)
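
A loop-free variant is also possible with the full distance matrix. The sketch below is an assumption-laden illustration rather than part of either answer: the names D, inv_D, index_focal and index_other are invented here. Note that the question's prose talks about dividing the other point's BA by the distance, while its code and both answers divide the focal point's BA; index_focal follows the answers, and index_other shows the other reading in case that is what is actually wanted.

```r
# Sample data from the question
df <- data.frame(ID = 1:4,
                 BA = c(3, 5, 6, 9),
                 x  = c(0, 2, 3, 7),
                 y  = c(1, 3, 4, 9))

# Full matrix of pairwise Euclidean distances
D <- as.matrix(stats::dist(df[c("x", "y")]))

# Setting the diagonal to Inf makes 1/D zero there,
# so a point never contributes to its own sum
diag(D) <- Inf
inv_D <- 1 / D

# Focal point's BA in the numerator (what the answers above compute)
index_focal <- unname(df$BA * rowSums(inv_D))

# Other points' BA in the numerator (the reading suggested by the question's prose)
index_other <- as.vector(inv_D %*% df$BA)

index_focal
#> [1] 2.049983 5.943485 6.593897 3.404545
```

Which of the two vectors is the right index depends on which BA the formula is meant to use; only index_focal matches the outputs posted above.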
