I had a large data frame that I grouped and then split into a list of over 400 tibbles. Some of these tibbles have a column whose entries are all 0, and I would like to remove those groups from the list or from the data frame.
A smaller sample of what my data looks like can be seen here:
dfa <- data.frame(intensity.x = c(10, 20, 100, 30 , 40), intensity.y = c(100, 30, 0.0, 20, 0), group = c('a', 'a', 'a', 'a', 'a'))
dfb <- data.frame(intensity.x = c(100, 10, 45, 60 , 43), intensity.y = c(0, 0, 0, 0, 0), group = c('b', 'b', 'b', 'b', 'b'))
dfx <- data.frame(intensity.x = c(20, 4, 5, 16 , 3), intensity.y = c(0, 12, 0, 1, 0), group = c('x', 'x', 'x', 'x', 'x'))
dfy <- data.frame(intensity.x = c(10, 10, 30, 20 , 80), intensity.y = c(0, 0, 0, 0, 0), group = c('y', 'y', 'y', 'y', 'y'))
df.big <- rbind(dfa, dfb, dfx, dfy)
df.list <- list(dfa, dfb, dfx, dfy)
Essentially I want groups like dfy and dfb to be filtered out of my large data frame (df.big) or the list (df.list) because all of their intensity.y values are 0, but I can't use
filter(df.big, intensity.y != 0)
because that would also remove the zero rows from groups dfa and dfx, which I want to keep.
Is this possible?
CodePudding user response:
Yes, you can do:
df.list[sapply(df.list, function(df) !all(df$intensity.y == 0))]
#> [[1]]
#> intensity.x intensity.y group
#> 1 10 100 a
#> 2 20 30 a
#> 3 100 0 a
#> 4 30 20 a
#> 5 40 0 a
#>
#> [[2]]
#> intensity.x intensity.y group
#> 1 20 0 x
#> 2 4 12 x
#> 3 5 0 x
#> 4 16 1 x
#> 5 3 0 x
Created on 2022-11-21 with reprex v2.0.2
CodePudding user response:
Using base R with Filter:
Filter(\(x) any(x$intensity.y != 0), df.list)
[[1]]
intensity.x intensity.y group
1 10 100 a
2 20 30 a
3 100 0 a
4 30 20 a
5 40 0 a
[[2]]
intensity.x intensity.y group
1 20 0 x
2 4 12 x
3 5 0 x
4 16 1 x
5 3 0 x
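If you would rather filter df.big directly instead of the list, the same idea can be applied per group. This is only a minimal sketch in base R: it rebuilds the sample data from the question, and has_nonzero and df.kept are names made up for illustration.

```r
# Sample data from the question, rebuilt as one data frame
df.big <- data.frame(
  intensity.x = c(10, 20, 100, 30, 40, 100, 10, 45, 60, 43,
                  20, 4, 5, 16, 3, 10, 10, 30, 20, 80),
  intensity.y = c(100, 30, 0, 20, 0, rep(0, 5),
                  0, 12, 0, 1, 0, rep(0, 5)),
  group = rep(c("a", "b", "x", "y"), each = 5)
)

# One TRUE/FALSE per group: does the group have any non-zero intensity.y?
has_nonzero <- tapply(df.big$intensity.y != 0, df.big$group, any)

# Keep all rows (zeros included) of the groups that passed the check
df.kept <- df.big[df.big$group %in% names(has_nonzero)[has_nonzero], ]

unique(df.kept$group)
#> [1] "a" "x"
```

The zero rows inside groups a and x are kept, because the test is made once per group rather than once per row.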
CodePudding user response:
Alternative using purrr:
df.list |> purrr::keep(~dplyr::summarise(.x, sum(intensity.y)) != 0)
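Since the question also mentions df.big and dplyr's filter, a grouped filter may be worth trying on the unsplit data frame. The following is a sketch under the assumption that dplyr is installed. It rebuilds the sample data from the question, and df.kept is a name chosen here for illustration.

```r
library(dplyr)

# Sample data from the question, rebuilt as one data frame
df.big <- data.frame(
  intensity.x = c(10, 20, 100, 30, 40, 100, 10, 45, 60, 43,
                  20, 4, 5, 16, 3, 10, 10, 30, 20, 80),
  intensity.y = c(100, 30, 0, 20, 0, rep(0, 5),
                  0, 12, 0, 1, 0, rep(0, 5)),
  group = rep(c("a", "b", "x", "y"), each = 5)
)

# any() is evaluated once per group, so a group is kept or dropped as a whole
df.kept <- df.big |>
  group_by(group) |>
  filter(any(intensity.y != 0)) |>
  ungroup()

unique(df.kept$group)
#> [1] "a" "x"
```

Testing with any(intensity.y != 0) rather than a sum also avoids dropping a group whose non-zero values happen to cancel out, should the column ever contain negative numbers.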