Home > Software design >  Pairwise subtraction of two numeric vectors and selection of results within a given range
Pairwise subtraction of two numeric vectors and selection of results within a given range

Time:07-09

I have two dataframes of the form

df.f <- data.frame(ID=c("A_1", "A_2", "A_3"), B=c(130, 250, 330))
df.r <- data.frame(ID=c("B_1", "B_2", "B_3"), B=c(170, 350, 480))

I want to pairwise subtract df.f$B from df.r$B (direction is important), and selects those pairs which subtraction value is within a range of 200 and 300.

I know how to outer:

    outer(df.r$B, df.f$B, '-')

     [,1] [,2] [,3]
[1,]   40  -80 -160
[2,]  220  100   20
[3,]  350  230  150

and could potentially melt the resulting matrix and then select upon the value column, but outer doesnt keep names:

library(reshape2)
res <- melt(as.matrix(outer(df.r$B, df.f$B, '-')))
final <- res[res$value >= 200 & res$value <= 300,]

> final
  Var1 Var2 value
2    2    1   220
6    3    2   230

Any idea how to achieve a result in the form of:

Var1 Var2 Diff
A_1 B_2 220
A_2 B_3 230

I guess it can be done with finding the indices in the original table.

Many thanks!

CodePudding user response:

You can assign names after outer:

o <- outer(df.r$B, df.f$B, '-')
dimnames(o) <- list(df.r$ID, df.f$ID)
o
#     A_1 A_2  A_3
# B_1  40 -80 -160
# B_2 220 100   20
# B_3 350 230  150

or to each vector:

o <- outer(setNames(df.r$B, df.r$ID), setNames(df.f$B, df.f$ID), '-')
o
#     A_1 A_2  A_3
# B_1  40 -80 -160
# B_2 220 100   20
# B_3 350 230  150

Which melts well:

res <- reshape2::melt(o)
res
#   Var1 Var2 value
# 1  B_1  A_1    40
# 2  B_2  A_1   220
# 3  B_3  A_1   350
# 4  B_1  A_2   -80
# 5  B_2  A_2   100
# 6  B_3  A_2   230
# 7  B_1  A_3  -160
# 8  B_2  A_3    20
# 9  B_3  A_3   150

res[res$value >= 200 & res$value <= 300,]
#   Var1 Var2 value
# 2  B_2  A_1   220
# 6  B_3  A_2   230

Alternatively, you can do an outer-join (merge on nothing) and some simple assignment:

with(merge(df.f, df.r, by = c()),
     data.frame(Var1=ID.x, Var2=ID.y, Diff=B.y-B.x))
#   Var1 Var2 Diff
# 1  A_1  B_1   40
# 2  A_2  B_1  -80
# 3  A_3  B_1 -160
# 4  A_1  B_2  220
# 5  A_2  B_2  100
# 6  A_3  B_2   20
# 7  A_1  B_3  350
# 8  A_2  B_3  230
# 9  A_3  B_3  150

(and subset as desired).

  •  Tags:  
  • r
  • Related