Return name of empty columns from a list

I have a named list of data frames that all contain the same columns, but in some of these data frames some of the columns are empty (all NA). What I'm hoping to return is the name of each data frame in the list along with the name(s) of its empty column(s).

The reprex below mirrors the process I am using on the full problem:

library(tidyverse)

data("diamonds") 

data1 <- diamonds 

data1$color <- NA

data1$price <- NA

data2 <- diamonds

data2$carat <- NA

data1$Type <- "data1"

data2$Type <- "data2"

data1%>%
  bind_rows(data2) -> dataFull

dataSplit <- split(dataFull, f = dataFull$Type)

for(i in dataSplit){
  
  which(sapply(dataSplit[[i]], function(x) all(is.na(x))))
  
}

My hope is to return something like

data1: price, color

data2: carat

I've tried the very basic for loop included above; loops are admittedly not my strong suit.

CodePudding user response：

Your sapply idea was right, but you need to subset the names of each data frame with the output. Also, since you are loading the tidyverse, you may as well use map instead of a loop for brevity:

map(dataSplit, ~ names(.x)[sapply(.x, \(x) all(is.na(x)))])
#> $data1
#> [1] "color" "price"
#> 
#> $data2
#> [1] "carat"
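
For completeness, the loop in the question fails because `for (i in dataSplit)` assigns each data frame itself to `i`, so `dataSplit[[i]]` is not a valid index, and the result of `which()` is never stored or printed inside the loop. Below is a minimal sketch of a loop that iterates over the names instead and prints one line per data frame, close to the layout asked for. It assumes a small hypothetical stand-in for `dataSplit` (built in base R so it runs without the tidyverse); `report` and `isEmpty` are illustrative names.

```r
# Hypothetical stand-in for dataSplit: a small named list of data frames
dataSplit <- list(
  data1 = data.frame(carat = c(0.2, 0.3), color = NA, price = NA),
  data2 = data.frame(carat = NA, color = c("E", "I"), price = c(326L, 334L))
)

report <- character(0)
for (nm in names(dataSplit)) {          # iterate over names, not data frames
  df <- dataSplit[[nm]]
  # TRUE for every column whose values are all NA
  isEmpty <- vapply(df, function(col) all(is.na(col)), logical(1))
  # Build "<data frame name>: <col>, <col>"
  report[nm] <- paste0(nm, ": ", paste(names(df)[isEmpty], collapse = ", "))
}
writeLines(report)
```

The column names come out in the order they appear in each data frame.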

CodePudding user response：

lapply(dataSplit, function(x) {
  cn <- colnames(x)
  # vapply checks each column in place, without coercing the data frame to a matrix
  isempty <- vapply(x, function(col) all(is.na(col)), logical(1))
  cn[isempty]
})

$data1
[1] "color" "price"

$data2
[1] "carat"

CodePudding user response：

Using select with where() to keep only the all-NA columns, then taking their names:

library(dplyr)
library(purrr)
map(dataSplit, ~ .x %>% 
      select(where(~ all(is.na(.x)))) %>%
      names)
$data1
[1] "color" "price"

$data2
[1] "carat"

Or in base R, where a column counts as empty when it has zero non-NA values:

lapply(dataSplit, \(x) names(x)[!colSums(!is.na(x))])
$data1
[1] "color" "price"

$data2
[1] "carat"
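
The base R one-liner packs several steps into one expression, so here is a sketch that unpacks it on a single data frame. The toy data frame `x` and the intermediate names (`notMissing`, `counts`, `isEmpty`) are hypothetical and only there for illustration.

```r
# Hypothetical toy data frame: column b is entirely NA
x <- data.frame(a = c(1, NA, 3), b = NA, c = c("u", "v", NA))

notMissing <- !is.na(x)        # logical matrix, TRUE where a value is present
counts <- colSums(notMissing)  # number of present values in each column
isEmpty <- !counts             # a count of 0 coerces to FALSE, so !0 is TRUE
names(x)[isEmpty]              # names of the columns with no present values
```

Columns that are only partly missing, such as `a` and `c` here, have a non-zero count and are therefore not reported.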