I have two dataframes, df1 and df2. I want to compare them and save the result in df3, as shown below.
import pandas as pd
df1 = pd.read_excel('Original.xlsx')
df2 = pd.read_excel('Changed.xlsx')
df1: (image)
df2: (image)
I want output like this: (image)
CodePudding user response:
Your question is a bit hard to understand, and I think you should use code blocks to display df1, df2 and df3.
From my point of view, you may be able to get what you want with df.filter() and pd.concat() (concat is a top-level pandas function, not a DataFrame method).
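As a rough sketch of the concat idea: the sample data below is made up, since the real frames are only shown as images, and the column names ID and Val are assumptions. It also assumes no row is duplicated within a single frame. Stacking both frames and dropping every row that occurs twice leaves only the rows that differ.

```python
import pandas as pd

# Hypothetical sample data; the real frames come from Original.xlsx and Changed.xlsx.
df1 = pd.DataFrame({"ID": [1, 2, 3], "Val": ["a", "b", "c"]})
df2 = pd.DataFrame({"ID": [1, 2, 4], "Val": ["a", "x", "d"]})

# Stack both frames, tagging each row with the frame it came from.
stacked = pd.concat(
    [df1.assign(source="df1"), df2.assign(source="df2")],
    ignore_index=True,
)

# keep=False drops every (ID, Val) pair that occurs in both frames,
# so only rows unique to one frame remain.
df3 = stacked.drop_duplicates(subset=["ID", "Val"], keep=False).reset_index(drop=True)
print(df3)
```

The source column shows which side each remaining row belongs to, so a changed value shows up as one df1 row and one df2 row with the same ID.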
CodePudding user response:
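Pivot both dataframes so that each has one row per (ID, Col) pair:
# Index on both ID and Col (rather than columns='Col') so that reset_index() yields the columns ID, Col, Val
df1_pv = df1.pivot_table(index=['ID', 'Col'], values='Val', aggfunc='max').reset_index()
df2_pv = df2.pivot_table(index=['ID', 'Col'], values='Val', aggfunc='max').reset_index()
Both results then have the same column structure (ID, Col, Val), so they can be compared for changes.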
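# Rows whose ID appears only in df2
new_id = df2_pv[~df2_pv['ID'].isin(df1_pv['ID'])]
# Rows whose Col value appears only in df2
new_col = df2_pv[~df2_pv['Col'].isin(df1_pv['Col'])]
# (ID, Col) pairs present in both frames whose Val differs
val_change = pd.merge(df1_pv, df2_pv, on=['ID', 'Col'], suffixes=('_1', '_2')).query('`Val_1` != `Val_2`')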
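The three checks can also be combined into a single df3. This is only a sketch under the same assumption that both frames have the columns ID, Col and Val; the sample data and the status labels are made up for illustration. An outer merge with indicator=True marks each (ID, Col) pair as present in df1 only, df2 only, or both.

```python
import pandas as pd

# Hypothetical long-format data with the assumed columns ID, Col, Val.
df1 = pd.DataFrame({"ID": [1, 1, 2], "Col": ["A", "B", "A"], "Val": [10, 20, 30]})
df2 = pd.DataFrame({"ID": [1, 1, 2, 3], "Col": ["A", "B", "A", "A"], "Val": [10, 25, 30, 40]})

# The outer merge keeps pairs from both sides; the _merge column records their origin.
merged = df1.merge(df2, on=["ID", "Col"], how="outer", suffixes=("_1", "_2"), indicator=True)

# Translate the indicator into a readable status (labels are illustrative).
status = merged["_merge"].astype(str).map(
    {"left_only": "removed", "right_only": "added", "both": "same"}
)
# Pairs present in both frames but with different values count as changed.
changed = (merged["_merge"] == "both") & (merged["Val_1"] != merged["Val_2"])
status = status.mask(changed, "changed")

# Keep only the rows that differ between the two frames.
df3 = merged.assign(status=status).query("status != 'same'").drop(columns="_merge")
print(df3)
```

Here df3 holds one row per difference, with Val_1 and Val_2 side by side and a status column saying whether the pair was added, removed or changed.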