I have two dataframes:
df1:
c1-1 c2-45 c3-65 c4-88 c5-97
r1 NA 0.598817857 0.053798422 0.776829475 NA
r2 0.481191121 0.67205121 0.50231424 0.933501988 0.169838618
r3 0.127680486 NA 0.188186772 NA 0.410198769
r4 0.448870194 0.372560979 0.627946034 0.277422856 0.540501786
r5 0.828152448 0.962372344 0.72686092 0.881644452 0.822969723
df2:
c1-24 c2-98 c3-77 c4-82 c5-9
r1 0.528260595 0.602697657 0.15193253 0.458712206 0.785602995
r2 0.250479754 0.999715659 0.575051699 NA 0.830962509
r3 NA NA 0.733031402 0.189934875 0.554902551
r4 0.160801532 0.611729999 0.665725625 0.966146299 0.005503371
r5 0.483603251 0.306977032 0.377184726 0.109827232 0.63159439
Both contain the same row names but different column names (the string before the '-' symbol is the same in both dataframes, but the string after it differs).
I would like to compare the two dataframes and output the names of the rows that contain NA in at least one of them. For example, the output for the data above would be:
r1, r2, r3
CodePudding user response:
- Use is.na to check for NA values.
- Use | to get TRUE if either df1 or df2 has NA.
- Use rowSums to count NA values in each row.
- Return the row names of only those rows that have more than 0 NA values.
rownames(df1)[rowSums(is.na(df1) | is.na(df2)) > 0]
#[1] "r1" "r2" "r3"
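For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two data frames from the question and applies the same expression. The helper names rows, na_either and na_rows are only illustrative, and check.names = FALSE is used on the assumption that the hyphenated column names should be kept as-is.

```r
rows <- paste0("r", 1:5)

# Data copied from the question; check.names = FALSE keeps the "c1-1" style names
df1 <- data.frame(
  `c1-1`  = c(NA, 0.481191121, 0.127680486, 0.448870194, 0.828152448),
  `c2-45` = c(0.598817857, 0.67205121, NA, 0.372560979, 0.962372344),
  `c3-65` = c(0.053798422, 0.50231424, 0.188186772, 0.627946034, 0.72686092),
  `c4-88` = c(0.776829475, 0.933501988, NA, 0.277422856, 0.881644452),
  `c5-97` = c(NA, 0.169838618, 0.410198769, 0.540501786, 0.822969723),
  row.names = rows, check.names = FALSE
)
df2 <- data.frame(
  `c1-24` = c(0.528260595, 0.250479754, NA, 0.160801532, 0.483603251),
  `c2-98` = c(0.602697657, 0.999715659, NA, 0.611729999, 0.306977032),
  `c3-77` = c(0.15193253, 0.575051699, 0.733031402, 0.665725625, 0.377184726),
  `c4-82` = c(0.458712206, NA, 0.189934875, 0.966146299, 0.109827232),
  `c5-9`  = c(0.785602995, 0.830962509, 0.554902551, 0.005503371, 0.63159439),
  row.names = rows, check.names = FALSE
)

# Logical matrix: TRUE wherever the cell is NA in df1 or in df2
na_either <- is.na(df1) | is.na(df2)

# Keep the names of rows with at least one TRUE
na_rows <- rownames(df1)[rowSums(na_either) > 0]
print(na_rows)
#[1] "r1" "r2" "r3"
```

The element-wise | compares cells by position, so this relies on both data frames having the same dimensions and the same row order, as they do in the question.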
CodePudding user response:
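We can multiply the two dataframes so that an NA in either one propagates to the product, then check each column with is.na and combine the results with Reduce: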
row.names(df1)[Reduce(`|`, lapply(df1 * df2, is.na))]
Output:
[1] "r1" "r2" "r3"
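Both answers compare the data frames cell by cell, so they assume the rows are in the same order. If that is not guaranteed, one possible variant (not taken from either answer) is to align df2 to df1 by row name first and then use complete.cases. The toy data and the name df2_aligned below are hypothetical and only serve to show the reordering.

```r
# Hypothetical toy data: same row names, but df2's rows are in a different order
df1 <- data.frame(a = c(NA, 1, 2), b = c(3, 4, 5),
                  row.names = c("r1", "r2", "r3"))
df2 <- data.frame(x = c(6, NA, 7), y = c(8, 9, 10),
                  row.names = c("r3", "r2", "r1"))

# Reorder df2 so that its rows line up with df1 by name
df2_aligned <- df2[rownames(df1), , drop = FALSE]

# complete.cases() is TRUE for rows with no NA across all supplied data frames,
# so negating it selects rows with an NA in either one
na_rows <- rownames(df1)[!complete.cases(df1, df2_aligned)]
print(na_rows)
#[1] "r1" "r2"
```

Here r1 is flagged because of the NA in df1, and r2 because of the NA in df2 once the rows are matched by name.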