(R) How to remove all rows that have a NULL value in a specified column?

I am trying to remove every row whose lyrics column is NULL from my data frame tsLyrics. I have tried:

tsLyrics <- filter(tsLyrics, lyrics == NULL)

However, I get the following error:

Error in `filter()`:
! Problem while computing `..1 = lyrics == NULL`.
x Input `..1` must be of size 338 or 1, not size 0.

I have also tried changing my syntax to:

tsLyrics <- filter(tsLyrics, is.null(lyrics))

However, when I do this I get an empty data frame. How should I approach removing these NULLs?

In case it is relevant, each entry in the lyrics column is either a list or NULL.

Data Example

structure(list(track_name = c("Run (feat. Ed Sheeran) (Taylor’s Version) (From The Vault)", 
"The Very First Night (Taylor's Version) (From The Vault)", "All Too Well (10 Minute Version) (Taylor's Version) (From The Vault)", 
"State Of Grace (Taylor's Version)", "Red (Taylor's Version)"
), lyrics = list(structure(list(), .Names = character(0), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), structure(list(), .Names = character(0), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), structure(list(), .Names = character(0), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), structure(list(), .Names = character(0), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), NULL)), row.names = c(NA, 
-5L), class = c("tbl_df", "tbl", "data.frame"))

User response:

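Neither attempt tests the column element by element. Comparing with NULL (lyrics == NULL) returns a zero-length logical vector, which is why filter() complains about size 0. Calling is.null(lyrics) tests the column as a whole; the column is a list, not NULL, so the result is a single FALSE and every row is dropped.

Instead, use sapply to apply is.null to each element of the column, and negate the result so that the NULL rows are removed: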

df <- structure(list(track_name = c("Run (feat. Ed Sheeran) (Taylor’s Version) (From The Vault)", 
"The Very First Night (Taylor's Version) (From The Vault)", "All Too Well (10 Minute Version) (Taylor's Version) (From The Vault)", 
"State Of Grace (Taylor's Version)", "Red (Taylor's Version)"
), lyrics = list(structure(list(), .Names = character(0), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), structure(list(), .Names = character(0), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), structure(list(), .Names = character(0), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), structure(list(), .Names = character(0), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), NULL)), row.names = c(NA, 
-5L), class = c("tbl_df", "tbl", "data.frame"))

library(dplyr, warn.conflicts = FALSE)
df
#> # A tibble: 5 × 2
#>   track_name                                                           lyrics  
#>   <chr>                                                                <list>  
#> 1 Run (feat. Ed Sheeran) (Taylor’s Version) (From The Vault)           <tibble>
#> 2 The Very First Night (Taylor's Version) (From The Vault)             <tibble>
#> 3 All Too Well (10 Minute Version) (Taylor's Version) (From The Vault) <tibble>
#> 4 State Of Grace (Taylor's Version)                                    <tibble>
#> 5 Red (Taylor's Version)                                               <NULL>
df %>% 
  filter(!sapply(lyrics, is.null))
#> # A tibble: 4 × 2
#>   track_name                                                           lyrics  
#>   <chr>                                                                <list>  
#> 1 Run (feat. Ed Sheeran) (Taylor’s Version) (From The Vault)           <tibble>
#> 2 The Very First Night (Taylor's Version) (From The Vault)             <tibble>
#> 3 All Too Well (10 Minute Version) (Taylor's Version) (From The Vault) <tibble>
#> 4 State Of Grace (Taylor's Version)                                    <tibble>

Created on 2022-05-04 by the reprex package (v2.0.1)
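
As a small variation on the answer, here is a sketch that uses vapply instead of sapply. It assumes a made-up tibble called toy (not the asker's data) with a list column holding one NULL entry. It first reproduces the two behaviours described in the question, then filters with a per-element check. vapply is used here only because it always returns a logical vector, even when the data frame has zero rows, where sapply would return an empty list that cannot be negated.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data for illustration: a list column with one NULL entry.
toy <- tibble(
  track_name = c("song_a", "song_b", "song_c"),
  lyrics = list(tibble(line = "la"), NULL, tibble(line = "da"))
)

# Comparing with NULL yields a zero-length logical vector,
# which filter() rejects because it is neither size 1 nor nrow(toy).
null_comparison <- toy$lyrics == NULL
length(null_comparison)

# is.null() on the whole column returns one FALSE (the column is a list),
# so filter() recycles it and drops every row.
whole_column_check <- is.null(toy$lyrics)

# Per-element check: one TRUE/FALSE per row, negated to keep non-NULL rows.
cleaned <- toy %>%
  filter(!vapply(lyrics, is.null, logical(1)))

# Base R equivalent, without dplyr verbs.
cleaned_base <- toy[!vapply(toy$lyrics, is.null, logical(1)), ]
```

Both cleaned and cleaned_base keep only the rows whose lyrics entry is not NULL. Applied to the question, the same filter() call on tsLyrics should remove the NULL rows, assuming the column is structured like the posted example.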