statement to skip empty data frames in a function

Time:11-16

I have a function in R, similar to the one below, that takes a data frame as its main argument. The data frame can be empty, and I would like to skip those cases when I run the function with lapply, e.g. lapply(FileNames, function(y) my.function(y))

My function looks like this:

my.function <- function(table) {
  file <- read.csv(table, header=T)

  if(nrow(table) == 0) {
    next
  }

  # skip the df if it is empty and stop processing that df
  # do more stuff if not empty
}

I've tried this so far, but if a data frame is empty the whole process crashes. Any tips?

CodePudding user response:

Suppose you have a list of data frames like this:

my_list <- list(data.frame(a = 1:3, b = 1:3),
                data.frame(a = c(), b = c()),
                data.frame(a = 1:5, b = 1:5))

my_list
#> [[1]]
#>   a b
#> 1 1 1
#> 2 2 2
#> 3 3 3
#> 
#> [[2]]
#> data frame with 0 columns and 0 rows
#> 
#> [[3]]
#>   a b
#> 1 1 1
#> 2 2 2
#> 3 3 3
#> 4 4 4
#> 5 5 5

Then your function could be something like this:

my_func <- function(df) {

  # Return NULL and exit the function early if the data frame has zero rows
  if(nrow(df) == 0) return(NULL)

  # Do work on data frames with one or more rows
  data.frame(sum = df$a + df$b)
}

So now if we lapply we get:

result <- lapply(my_list, my_func)

result
#> [[1]]
#>   sum
#> 1   2
#> 2   4
#> 3   6
#> 
#> [[2]]
#> NULL
#> 
#> [[3]]
#>   sum
#> 1   2
#> 2   4
#> 3   6
#> 4   8
#> 5  10

And if you want to remove the NULL values from your list, you can do:

result[!sapply(result, is.null)]
#> [[1]]
#>   sum
#> 1   2
#> 2   4
#> 3   6
#> 
#> [[2]]
#>   sum
#> 1   2
#> 2   4
#> 3   6
#> 4   8
#> 5  10
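
As an alternative, the empty data frames can be dropped before the function is applied, so the function never sees them. The following is only a sketch of that idea using base R's Filter(); the names non_empty and sums are made up for illustration, and the list is the same my_list as above.

```r
# Same example list as above: the second element has zero rows
my_list <- list(data.frame(a = 1:3, b = 1:3),
                data.frame(a = c(), b = c()),
                data.frame(a = 1:5, b = 1:5))

# Keep only the data frames that have at least one row
# (non_empty is a hypothetical name used for this sketch)
non_empty <- Filter(function(df) nrow(df) > 0, my_list)

# The worker no longer needs its own empty check
sums <- lapply(non_empty, function(df) data.frame(sum = df$a + df$b))

# Filter() can also drop NULLs after the fact, equivalent to the
# result[!sapply(result, is.null)] idiom
also_clean <- Filter(Negate(is.null), list(sums[[1]], NULL, sums[[2]]))
```

Which version to prefer probably depends on whether you need the positions of the skipped inputs: keeping the NULL entries preserves the one-to-one match with the input list, while filtering first loses it.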

Created on 2021-11-15 by the reprex package (v2.0.0)
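
For the file-based function in the question, the same early return should work, with two caveats. First, `next` is only valid inside a for, while or repeat loop, so it fails inside a function called by lapply. Second, nrow() has to be called on the data frame returned by read.csv, not on the file name (nrow of a character string is NULL, which makes the if condition fail). Below is a rough sketch under those assumptions; read_and_process, the temporary file names and the columns a and b are all hypothetical. It also assumes you want to skip zero-byte files, since read.csv raises an error on a file with no lines.

```r
# Hypothetical helper: read one CSV and return NULL when it holds no data rows
read_and_process <- function(path) {
  # A zero-byte file makes read.csv() raise an error, so check for it first
  if (file.size(path) == 0) return(NULL)

  df <- read.csv(path, header = TRUE)

  # A header-only file gives a data frame with zero rows
  if (nrow(df) == 0) return(NULL)

  # Placeholder for the real work; assumes columns a and b exist
  data.frame(sum = df$a + df$b)
}

# Demo with three temporary files: normal, header-only and zero-byte
paths <- file.path(tempdir(), c("full.csv", "header_only.csv", "empty.csv"))
write.csv(data.frame(a = 1:3, b = 1:3), paths[1], row.names = FALSE)
writeLines("a,b", paths[2])
file.create(paths[3])

results <- lapply(paths, read_and_process)

# Drop the NULL entries produced by the skipped files
results <- Filter(Negate(is.null), results)
```

In this demo only the first file contributes a result; the other two are skipped without stopping the lapply call.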
