I have a data frame in R, called a1 in this toy example, and a row subset of it, called a2. I want to recover the indices of the rows of a1 that make up a2 (here 2 and 4).
I tried which and match, but neither worked.
set.seed(123)
a1=data.frame(x1=rnorm(5),x2=runif(5),x3=runif(5))
a2=a1[c(2,4),]
a2index=rep(NA,dim(a2)[1])
Here is my a1 data frame:
a1
x1 x2 x3
1 -0.56047565 0.9568333 0.89982497
2 -0.23017749 0.4533342 0.24608773
3 1.55870831 0.6775706 0.04205953
4 0.07050839 0.5726334 0.32792072
5 0.12928774 0.1029247 0.95450365
a2 is a row subset of a1:
a2
x1 x2 x3
2 -0.23017749 0.4533342 0.2460877
4 0.07050839 0.5726334 0.3279207
I managed to obtain the indices with a double loop, but it is too slow. Is there a way to speed it up?
Thanks for your help.
for (i in seq_len(nrow(a2))) {
  for (j in seq_len(nrow(a1))) {
    # compare row i of a2 with row j of a1, element by element
    if (all(a2[i, ] == a1[j, ])) {
      a2index[i] = j
      break
    }
  }
}
# return the index vector (2,4)
a2index
CodePudding user response:
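You can use match after transposing both data frames. Transposing turns each row into a column, and since a data frame is a list of columns, match then compares whole rows instead of single values.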
match(as.data.frame(t(a2)), as.data.frame(t(a1)))
[1] 2 4
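For comparison, here is a rough sketch of two other base R ways to get the same indices, next to the transpose approach above. The helper name row_key and the variables idx_t, idx_key and idx_rn are made up for this illustration. The row-name variant assumes a2 was created by plain row subsetting, so that its row names still point back into a1; the key variant assumes that pasting the values of a row gives a string that identifies it.

```r
set.seed(123)
a1 <- data.frame(x1 = rnorm(5), x2 = runif(5), x3 = runif(5))
a2 <- a1[c(2, 4), ]

# Approach from the answer: after t(), each original row is one column,
# and match() compares those columns as list elements.
idx_t <- match(as.data.frame(t(a2)), as.data.frame(t(a1)))

# Alternative 1 (hypothetical helper): paste each row into one string key,
# then match the keys. "\r" is an arbitrary separator unlikely to occur in data.
row_key <- function(df) do.call(paste, c(unname(as.list(df)), sep = "\r"))
idx_key <- match(row_key(a2), row_key(a1))

# Alternative 2: row names are kept by row subsetting, so they can be
# matched directly. This only works if the row names were not reset.
idx_rn <- match(rownames(a2), rownames(a1))

idx_key
idx_rn
```

The row-name version is the cheapest, but it compares labels only and not the data itself. If a2 may have been reordered, re-indexed or read from another source, one of the value-based versions is the safer choice.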