Home > Software engineering >  how to change suffix on new df column of df when merging iteratively
how to change suffix on new df column of df when merging iteratively

Time:12-20

I have a temp df and a a dflst. the temp has as columns the unique col names from a dataframes in a dflst . The dflst has a dynamic len, my problem arrises when len(dflst)>=4. aLL DFs (temp and all the ones in dflst) have columns with true/false values and a p column with numbers

code to recreate data:

#making temp df
var_cols=['a', 'b', 'c', 'd']
temp = pd.DataFrame(list(itertools.product([False, True], repeat=len(var_cols))), columns=var_cols)

#makinf dflst
df0=pd.DataFrame(list(itertools.product([False, True], repeat=len(['a', 'b']))), columns=['a', 'b'])
df0['p']= np.random.randint(1, 5, df0.shape[0])
df1=pd.DataFrame(list(itertools.product([False, True], repeat=len(['c', 'd']))), columns=['c', 'd'])
df1['p']= np.random.randint(1, 5, df1.shape[0])
df2=pd.DataFrame(list(itertools.product([False, True], repeat=len(['a', 'c', ]))), columns=['a', 'c'])
df2['p']= np.random.randint(1, 5, df2.shape[0])
df3=pd.DataFrame(list(itertools.product([False, True], repeat=len(['d']))), columns=['d'])
df3['p']= np.random.randint(1, 5, df3.shape[0])
dflst=[df0, df1, df2, df3]

I want to merge the dfs in dflst, so that the 'p'col values from dfs in dflst into temp df, in the rows with compatible values between the two .

I am currently doing it with pd.merge as follows:

for df in dflst:
    temp = temp.merge(df, on=list(df)[:-1], how='right')

but this results to a df that has same names for different columns, when dflst has 4 or more dfs.. I understand that that is due to suffix of merge. but it creates problems with column indexing.

How can I have unique names on the new columns added to temp iteratively?

CodePudding user response:

I don't fully understand what you want but IIUC:

for i, df in enumerate(dflst):
    temp = temp.merge(df.rename(columns={'p': f'p{i}'}),
                      on=df.columns[:-1].tolist(), how='right')
print(temp)

# Output:
        a      b      c      d  p0  p1  p2  p3
0   False  False  False  False   4   2   2   1
1   False   True  False  False   3   2   2   1
2   False  False   True  False   4   3   4   1
3   False   True   True  False   3   3   4   1
4    True  False  False  False   3   2   2   1
5    True   True  False  False   3   2   2   1
6    True  False   True  False   3   3   1   1
7    True   True   True  False   3   3   1   1
8   False  False  False   True   4   4   2   3
9   False   True  False   True   3   4   2   3
10  False  False   True   True   4   1   4   3
11  False   True   True   True   3   1   4   3
12   True  False  False   True   3   4   2   3
13   True   True  False   True   3   4   2   3
14   True  False   True   True   3   1   1   3
15   True   True   True   True   3   1   1   3
  • Related