Create a column that tells which other columns have an NA value

I have data like this:


id   color   shape    animal
4    red     NA       NA
5    NA      square   dog
3    blue    NA       cat
2    green   circle   NA

I want to make a new column that tells me which other columns have missing data in them, like:

id   color   shape    animal   Missing
4    red     NA       NA       shape, animal
5    NA      square   dog      color
3    blue    NA       cat      shape
2    green   circle   NA       animal

CodePudding user response:

We can use apply() row-wise (MARGIN = 1) to paste together the names of the columns that are NA in each row. df1[-1] drops the id column so it is never reported:

df1$Missing <- apply(df1[-1], 1, function(x) toString(names(x)[is.na(x)]))
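
To try this end to end, here is a minimal reproducible sketch. The data frame is rebuilt by hand from the table in the question, so the column types (numeric id, character for the rest) are an assumption rather than something the question states.

```r
# Example data rebuilt from the question; column types are assumed
df1 <- data.frame(
  id     = c(4, 5, 3, 2),
  color  = c("red", NA, "blue", "green"),
  shape  = c(NA, "square", NA, "circle"),
  animal = c(NA, "dog", "cat", NA)
)

# For each row of every column except id, keep the names of the
# NA entries and collapse them into one comma-separated string
df1$Missing <- apply(df1[-1], 1, function(x) toString(names(x)[is.na(x)]))

print(df1)
```

With this data, Missing holds "shape, animal", "color", "shape" and "animal". A row with no NA values would get an empty string, because toString() of a zero-length vector is "".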

CodePudding user response:

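Here is a tidyverse approach. The key piece is cur_column(), which returns the name of the column currently being processed inside across(). Each non-id column gets a helper column that holds its own name where the value is NA, and unite() then pastes the helper columns together, dropping the NAs: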

library(dplyr)
library(tidyr)

df %>% 
  mutate(across(-id, ~case_when(is.na(.) ~ cur_column()), .names = 'new_{col}')) %>%
  unite(New_Col, starts_with('new'), na.rm = TRUE, sep = ' ')

  id color  shape animal      New_Col
1  4   red   <NA>   <NA> shape animal
2  5  <NA> square    dog        color
3  3  blue   <NA>    cat        shape
4  2 green circle   <NA>       animal
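
The output above separates the names with a space, whereas the question asked for "shape, animal". A possible variant, assuming the same example data (rebuilt by hand here) and naming the result column Missing as in the question, only changes the sep argument of unite():

```r
library(dplyr)
library(tidyr)

# Example data rebuilt from the question; column types are assumed
df <- data.frame(
  id     = c(4, 5, 3, 2),
  color  = c("red", NA, "blue", "green"),
  shape  = c(NA, "square", NA, "circle"),
  animal = c(NA, "dog", "cat", NA)
)

out <- df %>%
  # Helper columns new_<name>: the column's own name where it is NA, else NA
  mutate(across(-id, ~ case_when(is.na(.x) ~ cur_column()),
                .names = "new_{.col}")) %>%
  # Paste the helper columns together with ", " and drop the NA pieces
  unite(Missing, starts_with("new_"), na.rm = TRUE, sep = ", ")

print(out)
```

The helper columns are removed by unite(), so out contains only id, color, shape, animal and Missing.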