Filtering "names" of dataframes in a list of dataframes (R, dplyr)

Time:01-25

df_list is a list containing the following dataframes:

$DF_1 $DF_2 $DF_3 $DF_4 $DF_5

Is there a way to remove dataframes from the list based on a condition on the "name" of the dataframe? (I understand that DF_1 isn't necessarily an attribute of the first dataframe; it is just the way to call the dataframe.)

For instance, I'm looking for a way to filter for dataframes with an odd number in the "name" (i.e., DF_1, DF_3, DF_5).

I've tried to work with the "names" of the dataframes, but I'm having trouble. I can only access the column names within each dataframe.

To summarize, I'm looking for a way to select dataframes within a list of dataframes based on a condition (not manually). Thank you so much in advance!

CodePudding user response:

Sure: extract the numbers from the names, then build a logical test and pass it to [:

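library(stringr)
# Pull the digits out of each list name, e.g. "DF_3" -> 3
name_nums = df_list |> names() |> str_extract("[0-9]+") |> as.integer()
# Keep only the elements whose number is odd
odd_list = df_list[name_nums %% 2 == 1]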

CodePudding user response:

Here is one approach. It works on the names of the list and on the list as a whole, not on the individual elements of the list.

df_list_2 <- df_list[ grepl('[13579]$', names(df_list) ) ]

This uses regular expressions and the fact that odd numbers end with an odd digit. You could also extract the number at the end and use %% to determine odd/even, then subset the list with that.
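
As a rough sketch of that second suggestion, in base R only: the snippet below assumes every name has the form DF_<number>, and the small df_list it builds is made up for illustration (it just mirrors the names in the question). The variable names name_nums, is_odd, and odd_list are likewise only illustrative.

```r
# Hypothetical example data mirroring the names in the question
df_list <- list(
  DF_1 = data.frame(a = 1:3),
  DF_2 = data.frame(a = 1:3),
  DF_3 = data.frame(a = 1:3),
  DF_4 = data.frame(a = 1:3),
  DF_5 = data.frame(a = 1:3)
)

# Assumption: names look like "DF_<number>", so dropping everything
# up to and including the last underscore leaves only the number
name_nums <- as.integer(sub("^.*_", "", names(df_list)))

# Logical vector with one entry per list element: TRUE where the number is odd
is_odd <- name_nums %% 2 == 1

# Subset the list itself (single bracket keeps it a list)
odd_list <- df_list[is_odd]

names(odd_list)
# [1] "DF_1" "DF_3" "DF_5"
```

Compared with matching only the last digit, working with the whole number makes it easy to swap in other conditions, such as name_nums > 2 or name_nums %in% c(1, 4).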

CodePudding user response:

Another approach, using sub to capture the trailing number:

df_list[as.numeric(sub(".*_(\\d+)$", "\\1", names(df_list))) %% 2 == 1]
$DF_1
# A tibble: 3 × 1
      a
  <int>
1     1
2     2
3     3

$DF_3
# A tibble: 3 × 1
      a
  <int>
1     1
2     2
3     3

$DF_5
# A tibble: 3 × 1
      a
  <int>
1     1
2     2
3     3

Data

df_list <- list(DF_1 = structure(list(a = 1:3), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -3L)), DF_2 = structure(list(
    a = 1:3), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-3L)), DF_3 = structure(list(a = 1:3), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -3L)), DF_4 = structure(list(
    a = 1:3), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-3L)), DF_5 = structure(list(a = 1:3), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -3L)))