Output row names containing NA in at least one data.frame

Time:10-27

I have two dataframes:

df1:

    c1-1         c2-45        c3-65        c4-88        c5-97
r1  NA           0.598817857  0.053798422  0.776829475  NA
r2  0.481191121  0.67205121   0.50231424   0.933501988  0.169838618
r3  0.127680486  NA           0.188186772  NA           0.410198769
r4  0.448870194  0.372560979  0.627946034  0.277422856  0.540501786
r5  0.828152448  0.962372344  0.72686092   0.881644452  0.822969723

df2:

    c1-24        c2-98        c3-77        c4-82        c5-9
r1  0.528260595  0.602697657  0.15193253   0.458712206  0.785602995
r2  0.250479754  0.999715659  0.575051699  NA           0.830962509
r3  NA           NA           0.733031402  0.189934875  0.554902551
r4  0.160801532  0.611729999  0.665725625  0.966146299  0.005503371
r5  0.483603251  0.306977032  0.377184726  0.109827232  0.63159439

Both have the same row names but different column names: the part before the '-' symbol is the same in both data frames, while the part after it differs.

I would like to compare the two data frames and output the names of the rows that contain NA in at least one of them. For the example above, the output would be:

r1, r2, r3

CodePudding user response:

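  • Use is.na on each data frame to get a logical matrix marking the NA cells.
  • Combine the two matrices with | so a cell is TRUE if either df1 or df2 has NA at that position. The differing column names do not matter because the comparison is positional.
  • Use rowSums to count the TRUE cells in each row.
  • Return the row names of only those rows whose count is greater than 0.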
rownames(df1)[rowSums(is.na(df1) | is.na(df2)) > 0]
#[1] "r1" "r2" "r3"
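
To see the intermediate steps, here is a minimal sketch that rebuilds the two data frames from the question and runs the same expression piece by piece. The values are rounded to two decimals for brevity, and the names na_either, na_count and rows_with_na are illustrative, not part of the original answer.

```r
# Rebuild the question's data (values rounded to two decimals).
# check.names = FALSE keeps column names such as "c1-1" unchanged.
df1 <- data.frame(
  "c1-1"  = c(NA,   0.48, 0.13, 0.45, 0.83),
  "c2-45" = c(0.60, 0.67, NA,   0.37, 0.96),
  "c3-65" = c(0.05, 0.50, 0.19, 0.63, 0.73),
  "c4-88" = c(0.78, 0.93, NA,   0.28, 0.88),
  "c5-97" = c(NA,   0.17, 0.41, 0.54, 0.82),
  row.names = paste0("r", 1:5),
  check.names = FALSE
)
df2 <- data.frame(
  "c1-24" = c(0.53, 0.25, NA,   0.16, 0.48),
  "c2-98" = c(0.60, 1.00, NA,   0.61, 0.31),
  "c3-77" = c(0.15, 0.58, 0.73, 0.67, 0.38),
  "c4-82" = c(0.46, NA,   0.19, 0.97, 0.11),
  "c5-9"  = c(0.79, 0.83, 0.55, 0.01, 0.63),
  row.names = paste0("r", 1:5),
  check.names = FALSE
)

# Step 1: logical matrices of NA positions, combined cell by cell.
na_either <- is.na(df1) | is.na(df2)

# Step 2: number of positions per row where either data frame has NA.
na_count <- rowSums(na_either)

# Step 3: keep the names of rows with at least one such position.
rows_with_na <- rownames(df1)[na_count > 0]
print(rows_with_na)
```

Note that this relies on both data frames having the same dimensions and the same row order; if the rows could be ordered differently, they would need to be aligned by row name first.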

CodePudding user response:

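We can multiply the two data frames elementwise, so that an NA in either one propagates to the product, then apply is.na to each column and combine the columns with Reduce: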

row.names(df1)[Reduce(`|`, lapply(df1 * df2, is.na))]

Output:

[1] "r1" "r2" "r3"
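
The one-liner above can be unpacked as follows. This is a sketch on small made-up data frames (a and b are hypothetical stand-ins, not the question's data), and it also shows a variant using Map that checks for NA directly instead of multiplying. The variant is an alternative suggested here, not part of the original answer; it may be preferable when the product itself could produce NaN (for example 0 * Inf), which is.na also reports as TRUE.

```r
# Hypothetical toy data: same row names, different column names.
a <- data.frame("c1-1" = c(NA, 1, 2), "c2-45" = c(3, 4, 5),
                row.names = c("r1", "r2", "r3"), check.names = FALSE)
b <- data.frame("c1-24" = c(6, 7, 8), "c2-98" = c(9, NA, 10),
                row.names = c("r1", "r2", "r3"), check.names = FALSE)

# Elementwise product: a cell is NA if it is NA in a or in b.
# This needs numeric columns and equal dimensions.
prod_df <- a * b

# One logical vector per column, TRUE where the product is NA.
col_flags <- lapply(prod_df, is.na)

# OR the column vectors together: TRUE for rows with any NA.
any_na <- Reduce(`|`, col_flags)
print(row.names(a)[any_na])

# Variant without arithmetic: pair up the columns of a and b
# positionally and test each pair for NA directly.
any_na_map <- Reduce(`|`, Map(function(x, y) is.na(x) | is.na(y), a, b))
print(row.names(a)[any_na_map])
```

Both versions pair columns by position, so they assume the columns of the two data frames are in corresponding order.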