How to filter databases in list based on column with different name

Time:05-31

I have a list containing several data frames, each holding different information. The first column of every data frame contains the values I need to build graphs, but that column has a different name in each one. I need to filter each data frame by matching its first column against an external vector.

For example:

library(dplyr)   # %>% and filter()
library(tibble)  # rownames_to_column()

mtcars2 <- mtcars %>% rownames_to_column("cars_model") %>% as.data.frame()
mtcars3 <- mtcars %>% rownames_to_column("cars_model_second") %>% as.data.frame()
list_two_database <- list(mtcars2, mtcars3)

model_to_select <- c("Fiat 128", "Honda Civic", "Lotus Europa")

Is there a way to filter each data frame in the list on its FIRST COLUMN (cars_model and cars_model_second) WITHOUT RENAMING the column itself?

The goal is to obtain a list containing the two data frames, each reduced to the three selected models.

Thank you in advance

Answer:

The following works by extracting the first column name as a string, first_col, then converting that string to a symbol with sym() and unquoting it with !! so that dplyr's filter() treats it as a column reference:

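library(dplyr)   # %>%, filter(), sym()
library(tibble)  # rownames_to_column()

mtcars2 <- mtcars %>% rownames_to_column("cars_model") %>% as.data.frame()
mtcars3 <- mtcars %>% rownames_to_column("cars_model_second") %>% as.data.frame()
list_two_database <- list(mtcars2, mtcars3)

model_to_select <- c("Fiat 128", "Honda Civic", "Lotus Europa")

func <- function(df) {
  # Name of the first column, whatever it is called in this data frame
  first_col <- colnames(df)[1]

  # Turn the string into a symbol and unquote it inside filter()
  filter(df, !!sym(first_col) %in% model_to_select)
}

# Apply the filter to every data frame in the list
lapply(list_two_database, func)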
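
As an alternative that is not part of the answer above, the same filtering can be sketched in base R by indexing the first column by position, so its name never comes into play. The helper name filter_first_col and the variable filtered are made up for this illustration, and the example rebuilds the two data frames without tibble so it runs on its own:

```r
# Rebuild the example data with base R only (mtcars ships with R).
mtcars2 <- data.frame(cars_model = rownames(mtcars), mtcars, row.names = NULL)
mtcars3 <- data.frame(cars_model_second = rownames(mtcars), mtcars, row.names = NULL)
list_two_database <- list(mtcars2, mtcars3)

model_to_select <- c("Fiat 128", "Honda Civic", "Lotus Europa")

# Hypothetical helper: keep the rows whose first column is in `keep`.
# df[[1]] selects the first column by position, so no renaming is needed.
filter_first_col <- function(df, keep) {
  df[df[[1]] %in% keep, , drop = FALSE]
}

# Apply the helper to every data frame in the list
filtered <- lapply(list_two_database, filter_first_col, keep = model_to_select)
```

Passing the vector as an argument, rather than reading model_to_select from the global environment, keeps the helper reusable with other selections. Within dplyr, filter(df, .data[[first_col]] %in% model_to_select) should behave the same as the !!sym() form.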