I have two data.frames with the same number of rows but a different number of columns:
df1 = data.frame(v1=c(NA,NA,2),v2=c(4,3,NA),v3=c(1,2,2))
df2 = data.frame(v4=c(NA,NA,NA), v5=c(NA,4,2))
> df1
  v1 v2 v3
1 NA  4  1
2 NA  3  2
3  2 NA  2
> df2
  v4 v5
1 NA NA
2 NA  4
3 NA  2
I need to check, row by row, whether any value in row x of df2 is not present in the corresponding row of df1. I would also like to ignore NA.
Result should be
0
1
0
since the value 4 (in df2, row 2) is not present in row 2 of df1 (where 3 and 2 are present).
At the moment I use:
for (row in seq(1:nrow(df1))){
print(as.numeric(FALSE %in% (df2[row,] %in% df1[row,])))
}
However, this seems to be a poor solution with respect to performance.
CodePudding user response:
We may use Map/mapply to loop over the lists obtained by splitting the datasets by row (asplit with MARGIN = 1), then use %in% on the non-NA values to create a logical and coerce it to binary (+)
+(mapply(\(x, y) any(!y[complete.cases(y)] %in%
    x[complete.cases(x)]), asplit(df1, 1), asplit(df2, 1)))
[1] 0 1 0
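To see what each step contributes, here is the same idea unrolled for a single row (row 2 of the example data). This is only an illustrative sketch; the names r1, r2, r1_vals, r2_vals, missing_in_df1 and row_result are made up for the example and are not part of the answer above.

```r
df1 <- data.frame(v1 = c(NA, NA, 2), v2 = c(4, 3, NA), v3 = c(1, 2, 2))
df2 <- data.frame(v4 = c(NA, NA, NA), v5 = c(NA, 4, 2))

# asplit(, 1) turns each data frame into a list of row vectors
r1 <- asplit(df1, 1)[[2]]   # row 2 of df1: NA 3 2
r2 <- asplit(df2, 1)[[2]]   # row 2 of df2: NA 4

# Drop NAs on both sides, then test membership with %in%
r1_vals <- r1[!is.na(r1)]   # 3 2
r2_vals <- r2[!is.na(r2)]   # 4
missing_in_df1 <- !(r2_vals %in% r1_vals)

# any() collapses to one logical per row; unary plus coerces it to 0/1
row_result <- +any(missing_in_df1)
row_result
```

For row 2 this gives 1, because 4 is not among the non-NA values 3 and 2.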
CodePudding user response:
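for (row in seq_len(nrow(df1))) {
  vals <- unlist(df2[row, ])
  # drop NAs from the df2 row explicitly: %in% never returns NA, so na.rm in any() would have no effect
  print(as.numeric(any(!vals[!is.na(vals)] %in% unlist(df1[row, ]))))
}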
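Since the question is about performance, a loop-free variant may also be worth trying. The following is a rough sketch, not a benchmarked solution: it assumes both data frames hold numeric values only (so as.matrix() yields numeric matrices) and have more than one row, and the names m1, m2, found and result are made up for the example. It loops over the columns of df2 instead of the rows, which could help when there are many rows and few columns.

```r
df1 <- data.frame(v1 = c(NA, NA, 2), v2 = c(4, 3, NA), v3 = c(1, 2, 2))
df2 <- data.frame(v4 = c(NA, NA, NA), v5 = c(NA, 4, 2))

m1 <- as.matrix(df1)
m2 <- as.matrix(df2)

# For each df2 column, flag the rows where its value equals any df1 value
# in the same row; comparisons involving NA are dropped by na.rm = TRUE
found <- sapply(seq_len(ncol(m2)), function(j) {
  rowSums(m1 == m2[, j], na.rm = TRUE) > 0
})

# A df2 value counts as missing from df1 only if it is not NA and was not found
result <- as.integer(rowSums(!is.na(m2) & !found) > 0)
result
```

On the example data this returns 0 1 0, matching the expected result.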