Remove elements from a list by condition

Time:09-23

I have a list of 108 data frames, call it "LDF". Every data frame in the list has the same column "Value", among others. What I need to tell R is:

if sum(Value) for a data frame in the list is greater than 0, keep that element in the list; otherwise, drop it.

Basically, I should end up with about 104 data frames at the end of the process.

I'd like to avoid a for loop. Can someone think of a solution using the apply family?

I was trying:

LDF <- LDF[sapply(LDF$Value, sum) > 0]

but got a 'List of 0' as the result.

Sample data:

LDF <- list(structure(list(Date = structure(c(18765, 18767, 18778, 18778, 
18779, 18787, 18795, 18809, 18809, 18809, 18820, 18821, 18848, 
18864, 18871, 18880, 18885, 18886), class = "Date"), Value = c(120000, 
40000, 55000, -11.38, -115091.86, 30000, 98400, 1720, 50000, 
-50062.58, -2502.82, -20021.71, 28619.27, 45781.12, 14953.83, 
-6017.31, -3310.73, -140372.91)), row.names = c(NA, -18L), class = c("tbl_df", 
"tbl", "data.frame")), structure(list(Date = structure(c(18820, 
18820, 18820, 18820, 18820, 18821, 18857, 18857, 18857, 18857, 
18857, 18857, 18858, 18871, 18871, 18887, 18887, 18890, 18890
), class = "Date"), Value = c(41000, 41000, 122754.88, 41000, 
41000, 82000, -41080.42, -41432.51, -160308.38, -120504.54, -37214.87, 
-76707.98, -42592.41, -41248.63, -41824.33, -120572.42, -37472.61, 
-79312, -34830.47)), row.names = c(NA, -19L), class = c("tbl_df", 
"tbl", "data.frame")))

Answer:

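The column has to be extracted inside the function passed to sapply. LDF is a list of data frames/tibbles, so the list itself has no element named Value: LDF$Value is NULL, and subsetting LDF with the result returns an empty list. Instead, compute the sum for each data frame and compare it with 0: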

i1 <- sapply(LDF, function(x) sum(x$Value)) > 0
LDF[i1]

Output:

[[1]]
# A tibble: 18 x 2
   Date           Value
   <date>         <dbl>
 1 2021-05-18  120000  
 2 2021-05-20   40000  
 3 2021-05-31   55000  
 4 2021-05-31     -11.4
 5 2021-06-01 -115092. 
 6 2021-06-09   30000  
 7 2021-06-17   98400  
 8 2021-07-01    1720  
 9 2021-07-01   50000  
10 2021-07-01  -50063. 
11 2021-07-12   -2503. 
12 2021-07-13  -20022. 
13 2021-08-09   28619. 
14 2021-08-25   45781. 
15 2021-09-01   14954. 
16 2021-09-10   -6017. 
17 2021-09-15   -3311. 
18 2021-09-16 -140373. 
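
As a minimal sketch of the same idea on data small enough to check by hand: the list `toy` below is hypothetical (it is not the asker's data), and `vapply` is swapped in for `sapply` only because it also verifies that each sum is a single number.

```r
# Hypothetical toy stand-in for LDF (not the asker's data)
toy <- list(
  data.frame(Value = c(10, -3)),  # sum =  7 -> kept
  data.frame(Value = c(5, -8)),   # sum = -3 -> dropped
  data.frame(Value = c(0, 0))     # sum =  0 -> dropped (0 is not > 0)
)

# The list itself has no element named "Value", so `$` returns NULL
is.null(toy$Value)

# Compute one sum per data frame; numeric(1) states the expected result type
sums <- vapply(toy, function(x) sum(x$Value), numeric(1))
keep_idx <- sums > 0

# Keep only the data frames whose sum is strictly positive
kept <- toy[keep_idx]
length(kept)
```

Here `is.null(toy$Value)` is TRUE, which is why the original attempt had nothing to iterate over, and `length(kept)` is 1.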

To check which elements were dropped, negate (!) the logical vector:

which(!i1)

gives their positions in the list, and the following returns the dropped data frames themselves:

LDF[!i1]

Alternatively, use Filter from base R (the \(x) lambda shorthand requires R 4.1.0 or later):

Filter(\(x) sum(x$Value) > 0, LDF)

Or use keep from purrr:

library(purrr)
keep(LDF, ~ sum(.x$Value) > 0)

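discard does the opposite: it drops the elements for which the condition is TRUE, so this returns the data frames whose sum is not greater than 0: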

discard(LDF, ~ sum(.x$Value) > 0)
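
One caveat worth sketching, under the assumption that Value might contain missing values (the sample data has none): `sum()` then returns NA, and so does the comparison with 0. The toy list below is hypothetical; passing `na.rm = TRUE` is one possible way to get a definite TRUE/FALSE for every element, not necessarily the right choice for every data set.

```r
# Hypothetical toy list; the second data frame contains an NA
toy <- list(
  data.frame(Value = c(10, -3)),
  data.frame(Value = c(5, NA)),
  data.frame(Value = c(-1, -2))
)

# Without na.rm the second sum is NA, so the comparison is NA as well
i_raw <- vapply(toy, function(x) sum(x$Value), numeric(1)) > 0

# Ignoring NAs inside sum() gives a definite TRUE/FALSE for every element
i_clean <- vapply(toy, function(x) sum(x$Value, na.rm = TRUE), numeric(1)) > 0

# Logical indexing and Filter() agree once the index has no NA in it
kept_index  <- toy[i_clean]
kept_filter <- Filter(function(x) sum(x$Value, na.rm = TRUE) > 0, toy)
identical(kept_index, kept_filter)
```

With this toy data, `i_raw` is TRUE, NA, FALSE, while `i_clean` is TRUE, TRUE, FALSE, and both subsetting approaches keep the first two data frames.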