I have a list of dataframes where each dataframe contains the same number of rows. I have a numeric vector (removal
) that specifies how much I want to remove from each dataframe (e.g., remove the last 25% from the first df, df1
).
I can keep the first x% of rows from each df by creating a function and using lapply
, but I don't know how to incorporate that code into a loop that would loop through removal
. Any help is appreciated.
library(dplyr)
df1 <- data.frame(var = sample(1:5, 100, replace = TRUE))
df2 <- data.frame(var = sample(1:5, 100, replace = TRUE))
df3 <- data.frame(var = sample(1:5, 100, replace = TRUE))
lst <- list(df1, df2, df3)
head(df1)
#> var
#> 1 3
#> 2 2
#> 3 2
#> 4 4
#> 5 5
#> 6 2
removal <- c(.25, .30, .50)
# I want to remove the last 25% of rows from df1,
# the last 30% of rows from df2,
# and the last 50% of rows from df3
# I can only make it so I can keep a static percentage from each df
# from the top of each df
# but I don't know to incorporate this code with a loop
# that would loop through `removal`
fake_removal <- .75
fake_remv <- function(x){
x <- x %>% filter(row_number() < nrow(x) * fake_removal)
return(x)
}
badlst <- lapply(lst, fake_remv)
print(nrow(badlst[[1]]))
#> [1] 74
head(badlst[[1]])
#> var
#> 1 3
#> 2 2
#> 3 2
#> 4 4
#> 5 5
#> 6 2
CodePudding user response:
We may need Map/mapply
in base R
or map2
from tidyverse
library(dplyr)
library(purrr)
map2(lst, removal, ~ .x %>%
filter(row_number() < nrow(.x) * .y))
The equivalent Map
option would be
Map(function(dat, rml) subset(dat, seq_len(nrow(dat)) < nrow(dat) * rml),
lst, removal)