Sweeping matrix by second matrix by group-CodePudding

I have a problem and I am unable to find similar results (also because I am not sure how to word the search itself). I have a huge matrix with multiple columns. This matrix also has a column that specifies the "identity" of each line. I also have a second matrix with thresholds for each column for each identity. I can simplify the problem with the following code (although, take in mind that I have many many columns).

temp_df=data.frame(a_name=rep(c("A","B", "C"), 20), matrix(rnorm(40),nrow=60, ncol=2))

That contains:

> head(temp_df)
  a_name          X1         X2
1      A  0.31469191 -0.2763107
2      B -1.17477425 -0.2066650
3      C  0.10651550  0.1581325
4      A -1.88258477 -0.9280463
5      B -2.58260181  1.0322196
6      C  0.03027953  0.3110290

And the threshold matrix:

temp_limits=data.frame(a_name=c("A", "B", "C"), X1=c(1, 0.5, 0), x2=c(-1, -0.5, -0.25))

That contains:

> temp_limits
  a_name  X1    x2
1      A 1.0 -1.00
2      B 0.5 -0.50
3      C 0.0 -0.25

I could process the matrix as follow to obtain what I want:

res_df=NULL
for(i in unique(temp_df$a_name)){
  tdf=temp_df[temp_df$a_name==i,]
  a_lim=temp_limits[temp_limits$a_name==i,]
  tdf[, 2:3]=sweep(as.matrix(tdf[, 2:3]), MARGIN=2, a_lim[, 2:3], FUN=">" )
  res_df=rbind(res_df, tdf)
}

That gives what I want:

> head(res_df)
   a_name    X1    X2
1       A FALSE  TRUE
4       A FALSE  TRUE
7       A FALSE  TRUE
10      A FALSE  TRUE
13      A FALSE FALSE
16      A FALSE  TRUE

But I believe this implementation is too long, cumbersome, and not optimal. I am thinking I should use group_by, by I am unsure on how to proceed. Any help?

CodePudding user response：

You can try this

cbind(temp_df[1], temp_df[-1] > temp_limits[match(temp_df$a_name, temp_limits$a_name), -1])

CodePudding user response：

We may use a join here

library(data.table)
setDT(temp_df)[temp_limits, .(a_name, X1 = X1 > i.X1, X2 = X2 > i.x2),
    on = .(a_name)]