Home > Back-end >  How to combine two dataframes of unequal lenght on a key as connector?
How to combine two dataframes of unequal lenght on a key as connector?

Time:10-07

I have two dataframes which are uneven in the number of rows. Now I want to add them horizontally by aligning the second dataframe based on a key ("flag"). However, the flag serves merely as a connector at a specific row to the first (base) df which means the second dataframe should be pasted at that connector point. Please see visual for what I mean in case it is not clear.

I tried looking into merge, concat, join etc but it will join it does not seem quiet like what I am looking for.

enter image description here

CodePudding user response:

dif = df1['flag'].idxmax() - df2['flag'].idxmax()
df2.index = df2.index   dif
df1.merge(df2,how='outer',left_index=True,right_index=True)

enter image description here

Can make use of the above idea. Need to clean up the column names and drop the extra column. Works if dif is -ve.

Drop 1st 3 rows of df1

enter image description here

CodePudding user response:

You can try:

df_base = pd.DataFrame(data={'flag':[0,0,0,0,1,0,0],
                             'transaction_value':[1,1,2,2,5,6,9]})
df_group2 = pd.DataFrame(data={'flag':[0,0,1,0,0],
                               'transaction_value':[1,1,2,2,5]})

diff = df_base['flag'].argmax() - df_group2['flag'].argmax()
df_group2.index = df_group2.index   diff
print(df_base.join(df_group2[['transaction_value']], rsuffix='_group2'))

Output:

   flag  transaction_value  transaction_value_group2
0     0                  1                       NaN
1     0                  1                       NaN
2     0                  2                       1.0
3     0                  2                       1.0
4     1                  5                       2.0
5     0                  6                       2.0
6     0                  9                       5.0
  • Related