R: Check if any value in df row is not present in same row in other df

Time:05-31

I have two data frames with the same number of rows but a different number of columns:

df1 = data.frame(v1=c(NA,NA,2),v2=c(4,3,NA),v3=c(1,2,2))
df2 = data.frame(v4=c(NA,NA,NA), v5=c(NA,4,2))

> df1
  v1 v2 v3
1 NA  4  1
2 NA  3  2
3  2 NA  2

> df2
  v4 v5
1 NA NA
2 NA  4
3 NA  2

I need to check row-wise if there is any value in row x of df2 which is not present in the corresponding row in df1. I also would like to ignore NA.

Result should be

0
1
0

since value 4 (in df2, row 2) is not present in row 2 of df1 (where 3, 2 are present).

At the moment I use:

for (row in seq(1:nrow(df1))){
  print(as.numeric(FALSE %in% (df2[row,] %in% df1[row,])))
}

However, this row-by-row loop seems to be a poor solution with respect to performance.

CodePudding user response:

We may use Map/mapply to loop over the lists after splitting the datasets by row (asplit with MARGIN = 1). complete.cases drops the NA values from each row, %in% creates a logical vector, and the unary + coerces the result to binary (0/1).

+(mapply(\(x, y) any(!y[complete.cases(y)] %in%
    x[complete.cases(x)]), asplit(df1, 1), asplit(df2, 1)))
[1] 0 1 0
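
To make the steps inside the mapply call easier to follow, here is a small sketch that walks through a single row (row 2) by hand. The variable names (rows1, rows2, y_vals, x_vals, flag) are illustrative only, and is.na is used in place of complete.cases, which should be equivalent for a single row of values.

```r
df1 <- data.frame(v1 = c(NA, NA, 2), v2 = c(4, 3, NA), v3 = c(1, 2, 2))
df2 <- data.frame(v4 = c(NA, NA, NA), v5 = c(NA, 4, 2))

# asplit(df, 1) converts the data frame to a matrix and returns one element per row
rows1 <- asplit(df1, 1)
rows2 <- asplit(df2, 1)

# Take row 2 of each data frame
x <- rows1[[2]]  # values NA, 3, 2
y <- rows2[[2]]  # values NA, 4

# Drop the NA values on both sides so they are ignored in the comparison
x_vals <- x[!is.na(x)]  # 3 and 2
y_vals <- y[!is.na(y)]  # 4

# TRUE if at least one remaining value of y is absent from x;
# unary + turns the logical into 0/1
flag <- +any(!y_vals %in% x_vals)
print(flag)
```

For a row of df2 that contains only NA, y_vals is empty and any() of an empty logical vector is FALSE, which is why row 1 yields 0.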

CodePudding user response:

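for (row in seq_len(nrow(df1))) {
  # %in% never returns NA, so na.rm has no effect here; an NA in df2 only
  # counts as present when the same row of df1 also contains an NA
  print(as.numeric(any(!df2[row, ] %in% df1[row, ], na.rm = TRUE)))
}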
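
Since the question is about performance, a variant that avoids looping over rows can be sketched as follows. This is not taken from either answer; it assumes all columns are numeric (so as.matrix gives a numeric matrix), and the names m1, m2, found, missing and res are illustrative. It loops over the columns of df2 instead, which may help when there are many rows and few columns, but it is worth benchmarking on real data.

```r
df1 <- data.frame(v1 = c(NA, NA, 2), v2 = c(4, 3, NA), v3 = c(1, 2, 2))
df2 <- data.frame(v4 = c(NA, NA, NA), v5 = c(NA, 4, 2))

m1 <- as.matrix(df1)
m2 <- as.matrix(df2)

# For each column j of df2: does m2[i, j] occur anywhere in row i of df1?
# m1 == m2[, j] recycles the column down the rows, so every row of m1 is
# compared with its own value; NA comparisons are dropped by na.rm = TRUE.
found <- vapply(
  seq_len(ncol(m2)),
  function(j) rowSums(m1 == m2[, j], na.rm = TRUE) > 0,
  logical(nrow(m2))
)
found <- matrix(found, nrow = nrow(m2))  # keep a matrix shape even for one row

# A value is missing if it is not NA in df2 and was not found in df1
missing <- !is.na(m2) & !found
res <- as.integer(rowSums(missing) > 0)
print(res)
```

Unlike the loop above, this ignores NA in df2 regardless of whether the corresponding row of df1 contains an NA.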