Home > Enterprise >  Merging three dataframes with three similar indexes:
Merging three dataframes with three similar indexes:

Time:02-22

I have three dataframes

df1 :
Date           ID        Number     ID2   info_df1
2021-12-11     1         34         36       60
2021-12-10     2         33         35       57
2021-12-09     3         32         34       80
2021-12-08     4         3133       55

df2:
Date           ID        Number     ID2   info_df2
2021-12-10     2         18         20       50
2021-12-11     1         34         36       89
2021-12-10     2         33         35       40
2021-12-09     3         32         34       92

df3:
Date           ID        Number     ID2   info_df3
2021-12-10     2         18         20       57
2021-12-10     2         18         20       63
2021-12-11     1         34         36       52
2021-12-10     2         33         35       33

I need a data frame with info column from df1,df2 and df3 and Date,ID,Number,ID2 as index.

Format of the merged dataframe should consist these columns:

Date ID Number ID2 info_df1 info_df2
info_df3

CodePudding user response:

If you trying to merge the dataframe based on Date, I think what you need is merge function:

mergedDf = df1.merge(df2, on="Date").merge(df3, on="Date");
mergedDf.set_index("ID2", inplace = True)

But if you are trying to merge dataframes based on multiple columns, you can use a list of column names on the on argument:

mergedDf = df1.merge(df2, on=["Date", "ID", "ID2"]).merge(df3, on=["Date", "ID", "ID2"]);
mergedDf.set_index("ID2", inplace = True)

CodePudding user response:

Two steps:

  1. first, pandas.concat(<DFs-list>) all those DFs into a df;
  2. then, define a multi-index with df.set_index(<col-names-list>).

That will do it. Sure, you have to read some docs (here below), but those two steps should be about it.

CodePudding user response:

As others have mentioned, you need to merge the dataframes together. Using the built-in function functools.reduce, we can do this dynamically (for any number of dataframes) and easily:

i = 0

def func(x, y):
    global i
    i  = 1
    return y.merge(x.rename({'info': f'info_df{i   1}'}, axis=1), on=['Date', 'ID', 'Number', 'ID2'], how='outer')
    
dfs = [df1, df2, df3]
new_df = ft.reduce(func, dfs).rename({'info': 'info_df1'}, axis=1)

Output:

>>> new_df
         Date  ID  Number  ID2  info_df1  info_df7  info_df6
0  2021-12-10   2      18   20      57.0      50.0       NaN
1  2021-12-10   2      18   20      63.0      50.0       NaN
2  2021-12-11   1      34   36      52.0      89.0      60.0
3  2021-12-10   2      33   35      33.0      40.0      57.0
4  2021-12-09   3      32   34       NaN      92.0      80.0
5  2021-12-08   4      31   33       NaN       NaN      55.0
  • Related