Concat multiple dataframes from list of dataframe names

Time:10-01

I would like to concatenate multiple dataframes into a single dataframe, referring to them by name through a list of strings. The usual way to concatenate looks like this:

import pandas as pd

df1 = pd.DataFrame({'x': [1, 2, 3], 'y': ['a', 'b', 'c']})
df2 = pd.DataFrame({'x': [4, 5, 6], 'y': ['d', 'e', 'f']})

pd.concat([df1, df2])

Instead, I want to provide a list of dataframe names as strings, for example:

pd.concat(['df1', 'df2'])

Is this possible?

CodePudding user response:

Although using globals or exec answers the question, it is considered bad practice. A better way is to keep the dataframes in a dict, like this:

df_dict = {'df1': df1, 'df2': df2}

pd.concat(df_dict.values())
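As a side note to the dict approach above, pd.concat also accepts the mapping itself, in which case the dict keys become the outer level of the result's index. The sketch below assumes the same df1 and df2 as in the question; the level names 'source' and 'row' are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'x': [1, 2, 3], 'y': ['a', 'b', 'c']})
df2 = pd.DataFrame({'x': [4, 5, 6], 'y': ['d', 'e', 'f']})
df_dict = {'df1': df1, 'df2': df2}

# Passing the mapping itself turns the dict keys into the outer index level,
# so every row keeps track of which dataframe it came from.
# 'source' and 'row' are illustrative level names, not required ones.
labelled = pd.concat(df_dict, names=['source', 'row'])

# Select the rows that came from one dataframe by its name.
only_df2 = labelled.loc['df2']
```

This may be handy when you need to trace rows back to their origin after concatenating; if you only want the stacked rows, passing the values is enough.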

CodePudding user response:

Python variable names are fixed in the source code, so selecting values from a list of names at runtime is awkward. As mentioned in the comments, you could use globals() to look up variables in the global scope, but the more common practice is to store the dataframes in a dictionary from the beginning.

import pandas as pd

dataframes = {
    "df1": pd.DataFrame({'x': [1, 2, 3], 'y': ['a', 'b', 'c']}),
    "df2": pd.DataFrame({'x': [4, 5, 6], 'y': ['d', 'e', 'f']}),
}
to_concat = ["df1", "df2"]
result = pd.concat(dataframes[name] for name in to_concat)

Now the dataframes are all tucked neatly into their own namespace instead of being mixed with other stuff in globals. This is especially useful when the dataframes are read dynamically and you'd have to figure out how to get the names into the global space in the first place.
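To illustrate the "read dynamically" case, here is a minimal sketch that builds the dictionary while loading the data. The in-memory CSV strings in `sources` are hypothetical stand-ins for real files, and `ignore_index=True` is an optional choice to renumber the rows.

```python
import io

import pandas as pd

# Hypothetical stand-ins for files on disk: name -> CSV text.
sources = {
    "df1": "x,y\n1,a\n2,b\n3,c\n",
    "df2": "x,y\n4,d\n5,e\n6,f\n",
}

# Load each source into a dict keyed by name; no global variables are created.
dataframes = {
    name: pd.read_csv(io.StringIO(text)) for name, text in sources.items()
}

# Pick the dataframes to combine by name, in the order given.
to_concat = ["df1", "df2"]
result = pd.concat([dataframes[name] for name in to_concat], ignore_index=True)
```

With real files you would pass a path to pd.read_csv instead of an io.StringIO object; the lookup by name stays the same.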

CodePudding user response:

Do you want to use strings as variable names? If so, you can do:

str_list = ["df1", "df2"]
pd.concat([locals()[str_list[0]], locals()[str_list[1]]])
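If the list can have any length, one possible generalization is sketched below. `concat_by_name` is a hypothetical helper, not part of pandas. It takes the namespace as an argument, and the example captures locals() once, outside any comprehension, because before Python 3.12 a comprehension has its own scope and locals() called inside it would not see df1 or df2.

```python
import pandas as pd


def concat_by_name(names, namespace):
    """Concatenate the dataframes that `namespace` maps `names` to.

    Hypothetical helper for illustration; raises KeyError when a name is
    missing or does not refer to a DataFrame.
    """
    frames = []
    for name in names:
        obj = namespace.get(name)
        if not isinstance(obj, pd.DataFrame):
            raise KeyError(f"{name!r} is not a DataFrame in the given namespace")
        frames.append(obj)
    return pd.concat(frames)


df1 = pd.DataFrame({'x': [1, 2, 3], 'y': ['a', 'b', 'c']})
df2 = pd.DataFrame({'x': [4, 5, 6], 'y': ['d', 'e', 'f']})

str_list = ["df1", "df2"]
namespace = locals()  # captured once, in the scope where df1 and df2 live
result = concat_by_name(str_list, namespace)
```

The explicit check turns a typo in a name into a clear error rather than a confusing failure inside pd.concat.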