I have a list of several databases, each holding different information. The first column of every database contains the values I need to create graphs. I need to filter each database on its first column, using an external vector.
For example:
library(dplyr)   # %>%
library(tibble)  # rownames_to_column()
mtcars2 <- mtcars %>% rownames_to_column("cars_model") %>% as.data.frame()
mtcars3 <- mtcars %>% rownames_to_column("cars_model_second") %>% as.data.frame()
list_two_database <- list(mtcars2, mtcars3)
model_to_select <- c("Fiat 128", "Honda Civic", "Lotus Europa")
Is there a way to filter the list based on THE FIRST COLUMN OF EACH DATABASE in the list (cars_model and cars_model_second) WITHOUT RENAMING THE COLUMN ITSELF?
The goal is to obtain a list that contains the two databases, each reduced to the three selected models.
Thank you in advance
Answer:
The following works by extracting the name of the first column as a string, first_col,
and then converting that string into a symbol with sym() and unquoting it with !! so that dplyr's filter() treats it as a column reference:
library(dplyr)   # filter(), %>%, sym()
library(tibble)  # rownames_to_column()
mtcars2 <- mtcars %>% rownames_to_column("cars_model") %>% as.data.frame()
mtcars3 <- mtcars %>% rownames_to_column("cars_model_second") %>% as.data.frame()
list_two_database <- list(mtcars2, mtcars3)
model_to_select <- c("Fiat 128", "Honda Civic", "Lotus Europa")
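# Keep only the rows whose first-column value appears in model_to_select
func <- function(df) {
  first_col <- colnames(df)[1]  # name of the first column, as a string
  filter(df, !!sym(first_col) %in% model_to_select)  # sym() makes a symbol, !! unquotes it
}
lapply(list_two_database, func)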
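If a dependency-free variant is preferred, the same filtering can be sketched in base R by indexing the first column by position, so its name never matters. This is only an illustrative alternative, not part of the answer above: the helper name filter_first_col and its keep argument are made up for this example, and the two data frames are rebuilt with data.frame() instead of rownames_to_column() so that the block runs on its own.

```r
# Base R sketch: filter every data frame in a list on its first column.
# Assumption: the inputs are rebuilt without tibble so this runs standalone.
mtcars2 <- data.frame(cars_model = rownames(mtcars), mtcars, row.names = NULL)
mtcars3 <- data.frame(cars_model_second = rownames(mtcars), mtcars, row.names = NULL)
list_two_database <- list(mtcars2, mtcars3)
model_to_select <- c("Fiat 128", "Honda Civic", "Lotus Europa")

# Hypothetical helper: df[[1]] is the first column whatever it is called,
# so no column has to be renamed.
filter_first_col <- function(df, keep) {
  df[df[[1]] %in% keep, , drop = FALSE]
}

# Extra arguments to lapply() are passed on to the helper.
filtered <- lapply(list_two_database, filter_first_col, keep = model_to_select)
```

Passing the vector as an argument, rather than reading model_to_select from the global environment, lets the same helper be reused with a different selection.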