I have two dataframes which are uneven in the number of rows. Now I want to add them horizontally by aligning the second dataframe based on a key ("flag"). However, the flag serves merely as a connector at a specific row to the first (base) df which means the second dataframe should be pasted at that connector point. Please see visual for what I mean in case it is not clear.
I tried looking into merge, concat, join etc
but it will join it does not seem quiet like what I am looking for.
CodePudding user response:
dif = df1['flag'].idxmax() - df2['flag'].idxmax()
df2.index = df2.index dif
df1.merge(df2,how='outer',left_index=True,right_index=True)
Can make use of the above idea. Need to clean up the column names and drop the extra column. Works if dif is -ve.
Drop 1st 3 rows of df1
CodePudding user response:
You can try:
df_base = pd.DataFrame(data={'flag':[0,0,0,0,1,0,0],
'transaction_value':[1,1,2,2,5,6,9]})
df_group2 = pd.DataFrame(data={'flag':[0,0,1,0,0],
'transaction_value':[1,1,2,2,5]})
diff = df_base['flag'].argmax() - df_group2['flag'].argmax()
df_group2.index = df_group2.index diff
print(df_base.join(df_group2[['transaction_value']], rsuffix='_group2'))
Output:
flag transaction_value transaction_value_group2
0 0 1 NaN
1 0 1 NaN
2 0 2 1.0
3 0 2 1.0
4 1 5 2.0
5 0 6 2.0
6 0 9 5.0