I have a function in R, similar to the one below, that takes a data frame as its main argument. However, the data frame could be empty, and I would like to skip those cases when I run the function with lapply, e.g.
lapply(FileNames, function(y) my.function(y))
My function looks like this:
my.function <- function(table) {
file <- read.csv(table, header=T)
if(nrow(table) == 0)
{next}
#skip the df if is empty and stop processing that df
#do more stuffs if not empty
}
I've tried this so far, but if the data frame is empty the whole process crashes. Any tips?
Answer:
Suppose you have a list of data frames like this:
my_list <- list(data.frame(a = 1:3, b = 1:3),
data.frame(a = c(), b = c()),
data.frame(a = 1:5, b = 1:5))
my_list
#> [[1]]
#> a b
#> 1 1 1
#> 2 2 2
#> 3 3 3
#>
#> [[2]]
#> data frame with 0 columns and 0 rows
#>
#> [[3]]
#> a b
#> 1 1 1
#> 2 2 2
#> 3 3 3
#> 4 4 4
#> 5 5 5
Then your function could be something like this:
my_func <- function(df) {
# return NULL and stop function if dataframe has zero rows
if(nrow(df) == 0) return(NULL)
# Do work on dataframes with one or more rows
data.frame(sum = df$a + df$b)
}
So now if we call lapply on the list, we get:
result <- lapply(my_list, my_func)
result
#> [[1]]
#> sum
#> 1 2
#> 2 4
#> 3 6
#>
#> [[2]]
#> NULL
#>
#> [[3]]
#> sum
#> 1 2
#> 2 4
#> 3 6
#> 4 8
#> 5 10
And if you want to remove the NULL values from your list, you can do:
result[!sapply(result, is.null)]
#> [[1]]
#> sum
#> 1 2
#> 2 4
#> 3 6
#>
#> [[2]]
#> sum
#> 1 2
#> 2 4
#> 3 6
#> 4 8
#> 5 10
Created on 2021-11-15 by the reprex package (v2.0.0)
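
To tie this back to the question, where the function receives file names rather than data frames, here is a rough sketch of the same idea applied to CSV files. The helper name read_and_process, the demo files, and the columns a and b are made up for illustration; swap in your own processing. It also guards against zero-byte files, since read.csv stops with an error on a file that has no lines at all, which may or may not be what happens in your case.

```r
# Hypothetical helper: read one CSV and return NULL when there is nothing to process.
read_and_process <- function(path) {
  # A zero-byte file makes read.csv() stop with an error, so check the size first.
  if (file.size(path) == 0) return(NULL)
  df <- read.csv(path, header = TRUE)
  # A header-only file gives a data frame with zero rows; skip that too.
  if (nrow(df) == 0) return(NULL)
  # Placeholder for the real work (assumes columns a and b, as in the example above).
  data.frame(sum = df$a + df$b)
}

# Build three small demo files in a temporary directory.
tmp <- tempfile("demo_")
dir.create(tmp)
full_path   <- file.path(tmp, "full.csv")
header_path <- file.path(tmp, "header_only.csv")
empty_path  <- file.path(tmp, "empty.csv")
write.csv(data.frame(a = 1:3, b = 1:3), full_path, row.names = FALSE)
writeLines("a,b", header_path)
file.create(empty_path)

file_names <- c(full_path, header_path, empty_path)
results <- lapply(file_names, read_and_process)
# Drop the NULL entries left by the skipped files.
results <- Filter(Negate(is.null), results)
```

Filter(Negate(is.null), results) does the same job as result[!sapply(result, is.null)] above; use whichever reads better to you.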