Merge every N dataframes in a list

Time:09-22

I have a list of dataframes and I want to merge every 6 of them. The way I'm doing it is very manual:

from functools import reduce
import pandas as pd

list1 = bigList[0:6]    # dataframes 0..5 (the slice end is exclusive)
list1DF = reduce(lambda df1, df2: pd.merge(df1, df2, on='index', how='outer'), list1)

list2 = bigList[6:12]   # dataframes 6..11
list2DF = reduce(lambda df1, df2: pd.merge(df1, df2, on='index', how='outer'), list2)

and so on, until I have merged all of them, and then I concatenate the results:

full_df = pd.concat([list1DF, list2DF, ...])

Any tips as to how I can automate this?

CodePudding user response:

Sure, just grab chunks of your big list with a fixed step (5 here; use 6 if you want groups of six) and apply what you've been doing anyway:

from functools import reduce
import pandas as pd

big_list = ...
smaller_list = []

for idx in range(0, len(big_list), 5):
    chunk = big_list[idx:idx + 5]
    combined_df = reduce(lambda df1, df2: pd.merge(df1, df2, on='index', how='outer'), chunk)
    smaller_list.append(combined_df)

full_df = pd.concat(smaller_list)
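
To try the loop end to end, here is a minimal self-contained sketch. The helper name merge_in_chunks, its size and key parameters, and the toy frames are all made up for illustration; it also assumes that every dataframe really has a column called 'index', as in the question.

```python
from functools import reduce

import pandas as pd


def merge_in_chunks(frames, size, key="index"):
    """Outer-merge every `size` consecutive frames on `key`, then stack the results.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    merged = []
    for start in range(0, len(frames), size):
        chunk = frames[start:start + size]  # the last chunk may be shorter
        merged.append(
            reduce(lambda left, right: pd.merge(left, right, on=key, how="outer"), chunk)
        )
    return pd.concat(merged)


# Toy input: four frames sharing an 'index' column, each with one value column.
frames = [
    pd.DataFrame({"index": [0, 1], f"col{n}": [n, n + 10]})
    for n in range(4)
]

result = merge_in_chunks(frames, size=2)
print(result)
```

Slicing past the end of a list is safe in Python, so a final chunk with fewer than size frames is merged as-is. An empty input list would make pd.concat raise, so guard for that if it can happen in your data.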

CodePudding user response:

When you have to generate many objects, it is best not to give them numbered names (obj1, obj2, obj3…).

Collect everything in a list:

N = 5  # chunk size
list_DFs = [reduce(lambda df1, df2: pd.merge(df1, df2, on='index', how='outer'),
                   bigList[i:i + N])
            for i in range(0, len(bigList), N)]

full_df = pd.concat(list_DFs)
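
One caveat: on='index' only works if each dataframe has a column literally named 'index'. If the key is instead the row index of each dataframe (an assumption, since the question does not show the data), a possible variant is to swap the reduce/merge step for pd.concat along the columns, which aligns on the row index with an outer join by default. The toy frames and the value of N below are made up for illustration.

```python
import pandas as pd

N = 2  # chunk size; hypothetical value chosen for this toy example

# Toy frames keyed by their row index (not by an 'index' column).
bigList = [
    pd.DataFrame({f"col{n}": [n, n + 10]}, index=[0, 1])
    for n in range(5)
]

# Column-wise concat replaces reduce + merge here; it outer-joins on the row index.
list_DFs = [
    pd.concat(bigList[i:i + N], axis=1)
    for i in range(0, len(bigList), N)
]

full_df = pd.concat(list_DFs)
print(full_df)
```

This variant assumes the row index of each dataframe has no duplicate labels; with duplicates, the column-wise concat may fail, and the merge-based version is the safer choice.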