I have a list of 108 dataframes, let's say it's called "LDF", and every dataframe in the list has the same column "Value", among others. What I need to tell R is:
if sum(Value) for a dataframe in the list is greater than 0, keep that element in the list; otherwise, drop it.
Basically, I should end up with about 104 dataframes at the end of the process.
I'd like to avoid a for loop. Can someone think of a solution using the apply family?
I was trying:
LDF <- LDF[sapply(LDF$Value, sum) > 0]
but got a 'List of 0' as the result.
Sample data:
LDF <- list(structure(list(Date = structure(c(18765, 18767, 18778, 18778,
18779, 18787, 18795, 18809, 18809, 18809, 18820, 18821, 18848,
18864, 18871, 18880, 18885, 18886), class = "Date"), Value = c(120000,
40000, 55000, -11.38, -115091.86, 30000, 98400, 1720, 50000,
-50062.58, -2502.82, -20021.71, 28619.27, 45781.12, 14953.83,
-6017.31, -3310.73, -140372.91)), row.names = c(NA, -18L), class = c("tbl_df",
"tbl", "data.frame")), structure(list(Date = structure(c(18820,
18820, 18820, 18820, 18820, 18821, 18857, 18857, 18857, 18857,
18857, 18857, 18858, 18871, 18871, 18887, 18887, 18890, 18890
), class = "Date"), Value = c(41000, 41000, 122754.88, 41000,
41000, 82000, -41080.42, -41432.51, -160308.38, -120504.54, -37214.87,
-76707.98, -42592.41, -41248.63, -41824.33, -120572.42, -37472.61,
-79312, -34830.47)), row.names = c(NA, -19L), class = c("tbl_df",
"tbl", "data.frame")))
User response:
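We need to extract the column inside the loop. LDF is a list of data.frame/tibble objects, so LDF$Value doesn't exist: it returns NULL, and sapply over NULL gives an empty result, which is why the subset came back as a 'List of 0'. Instead, compute the sum for each element of the list: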
i1 <- sapply(LDF, function(x) sum(x$Value)) > 0
LDF[i1]
-output
[[1]]
# A tibble: 18 x 2
   Date          Value
   <date>        <dbl>
 1 2021-05-18  120000
 2 2021-05-20   40000
 3 2021-05-31   55000
 4 2021-05-31     -11.4
 5 2021-06-01 -115092.
 6 2021-06-09   30000
 7 2021-06-17   98400
 8 2021-07-01    1720
 9 2021-07-01   50000
10 2021-07-01  -50063.
11 2021-07-12   -2503.
12 2021-07-13  -20022.
13 2021-08-09   28619.
14 2021-08-25   45781.
15 2021-09-01   14954.
16 2021-09-10   -6017.
17 2021-09-15   -3311.
18 2021-09-16 -140373.
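To make the difference between the original attempt and the fix concrete, here is a minimal sketch on toy data. The list `toy` and its values are made up for illustration and are not the asker's data; `vapply` is swapped in for `sapply` only because it also checks that each result is a single number.

```r
# Toy list of two data frames (hypothetical data, not the asker's)
toy <- list(
  data.frame(Value = c(10, -3)),  # sum is 7, should be kept
  data.frame(Value = c(5, -8))    # sum is -3, should be dropped
)

# `$` on the list looks for a list element named "Value"; there is none,
# so the result is NULL
is.null(toy$Value)

# sapply() over NULL has nothing to iterate over, so the subset is empty
attempt <- toy[sapply(toy$Value, sum) > 0]
length(attempt)

# Summing inside the function gives one number per data frame;
# vapply() additionally enforces that each result is a single numeric
keep_idx <- vapply(toy, function(x) sum(x$Value), numeric(1)) > 0
kept <- toy[keep_idx]
length(kept)
```

The same pattern should carry over to the full list by replacing `toy` with `LDF`.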
To check which elements are dropped, negate (!) the logical vector. The positions of the dropped elements are given by
which(!i1)
and the dropped elements themselves by
LDF[!i1]
Alternatively, use Filter (the \(x) lambda shorthand requires R 4.1 or later; on older versions write function(x) instead):
Filter(\(x) sum(x$Value) > 0, LDF)
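As a small sketch of the Filter approach that also runs on R versions before 4.1, the predicate can be written as an ordinary named function. The names `toy` and `positive_sum` are hypothetical and the data is made up for illustration.

```r
# Toy named list of data frames (hypothetical data)
toy <- list(
  a = data.frame(Value = c(10, -3)),  # sum is 7
  b = data.frame(Value = c(5, -8)),   # sum is -3
  c = data.frame(Value = c(1, 1))     # sum is 2
)

# Predicate: TRUE when the Value column of a data frame sums to more than 0
positive_sum <- function(x) sum(x$Value) > 0

# Filter() keeps the elements for which the predicate is TRUE
kept <- Filter(positive_sum, toy)
names(kept)

# Same selection written with a logical index, for comparison
kept_by_index <- toy[sapply(toy, positive_sum)]
```

Giving the predicate a name makes it easy to reuse it for both the Filter call and the logical-index version.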
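Or with keep from purrr:
library(purrr)
keep(LDF, ~ sum(.x$Value) > 0)
discard does the opposite: with the same condition it returns the elements that keep would remove, i.e. those whose sum is not greater than 0:
discard(LDF, ~ sum(.x$Value) > 0)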
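A short sketch, assuming the purrr package is installed, showing that keep and discard with the same condition split the list into two complementary parts. The list `toy` is made-up illustration data.

```r
library(purrr)

# Toy list of data frames (hypothetical data)
toy <- list(
  data.frame(Value = c(10, -3)),  # sum is 7
  data.frame(Value = c(5, -8)),   # sum is -3
  data.frame(Value = c(1, 1))     # sum is 2
)

# keep() returns the elements where the condition is TRUE
kept <- keep(toy, ~ sum(.x$Value) > 0)

# discard() with the same condition returns the remaining elements
dropped <- discard(toy, ~ sum(.x$Value) > 0)

# Together the two results account for every element of the original list
length(kept) + length(dropped) == length(toy)
```

To get the filtered list from discard directly, the condition would have to be inverted, for example `discard(toy, ~ sum(.x$Value) <= 0)`.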