Compare two different dimensions dataframes

Time:03-18

I have two dataframes, df1 and df2. I want to compare them and save the result in df3, as shown below.

import pandas as pd
df1 = pd.read_excel('Original.xlsx')
df2 = pd.read_excel('Changed.xlsx')

df1: [image]

df2: [image]

I want output like this: [image]

CodePudding user response:

Your question is a bit hard to understand, and I think you should use code blocks to display df1, df2 and df3.

From my point of view, maybe you can use df.filter() and pd.concat() to achieve what you want.
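The answer does not say how the two calls would fit together, so the following is only one possible reading, not the answerer's actual method: filter both frames down to their shared columns, concatenate them, and drop the rows that appear in both. The sample frames and the `source` label column are hypothetical; the real data would come from the two Excel files.

```python
import pandas as pd

# Hypothetical sample data standing in for Original.xlsx and Changed.xlsx.
df1 = pd.DataFrame({"ID": [1, 2, 3], "A": [10, 20, 30], "B": ["x", "y", "z"]})
df2 = pd.DataFrame({"ID": [1, 2, 4], "A": [10, 25, 40], "C": [0.1, 0.2, 0.4]})

# Keep only the columns both frames share, in df1's order.
common = [c for c in df1.columns if c in df2.columns]
left = df1.filter(items=common)
right = df2.filter(items=common)

# Stack the two frames and label each row with its source ("source" is an
# illustrative column name), then drop rows that are identical in both.
stacked = pd.concat(
    [left.assign(source="df1"), right.assign(source="df2")],
    ignore_index=True,
)
diff = stacked.drop_duplicates(subset=common, keep=False)
print(diff)
```

Rows that survive `drop_duplicates(keep=False)` exist in only one frame or differ in at least one shared column. Columns that exist in only one frame are ignored by this approach.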

CodePudding user response:

Aggregate both dataframes so that each has one row per (ID, Col) pair. This assumes both already have ID, Col and Val columns:

df1_pv = df1.groupby(['ID', 'Col'], as_index=False)['Val'].max()
df2_pv = df2.groupby(['ID', 'Col'], as_index=False)['Val'].max()

Both results have the same column structure (ID, Col, Val), so they can be compared for changes:

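# Rows in df2 whose ID does not appear in df1
new_id = df2_pv[~df2_pv['ID'].isin(df1_pv['ID'])]
# Rows in df2 whose Col does not appear in df1
new_col = df2_pv[~df2_pv['Col'].isin(df1_pv['Col'])]
# (ID, Col) pairs present in both frames whose Val differs
val_change = pd.merge(df1_pv, df2_pv, on=['ID', 'Col'], suffixes=('_1', '_2')).query('`Val_1` != `Val_2`')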
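Since the question asks for a single df3, the separate lookups can probably be folded into one outer merge with `indicator=True`. This is a minimal sketch, assuming both frames have ID, Col and Val columns as in this answer; the sample data, the `classify` helper and the status labels are hypothetical, and the layout of df3 is a guess because the expected output is not available.

```python
import pandas as pd

# Hypothetical long-format data with the assumed ID, Col and Val columns.
df1 = pd.DataFrame({
    "ID":  [1, 1, 2, 2],
    "Col": ["A", "B", "A", "B"],
    "Val": [10, 20, 30, 40],
})
df2 = pd.DataFrame({
    "ID":  [1, 1, 2, 2, 3, 1],
    "Col": ["A", "B", "A", "B", "A", "C"],
    "Val": [10, 25, 30, 40, 50, 60],
})

# An outer merge keeps pairs that exist in only one frame; the _merge
# indicator column records which side each (ID, Col) pair came from.
merged = pd.merge(df1, df2, on=["ID", "Col"], how="outer",
                  suffixes=("_1", "_2"), indicator=True)

def classify(row):
    """Label one merged row (illustrative labels, not from the question)."""
    if row["_merge"] == "left_only":
        return "removed"
    if row["_merge"] == "right_only":
        return "added"
    return "changed" if row["Val_1"] != row["Val_2"] else "unchanged"

merged["status"] = merged.apply(classify, axis=1)

# Keep only the rows where something differs between the two frames.
df3 = merged.loc[merged["status"] != "unchanged",
                 ["ID", "Col", "Val_1", "Val_2", "status"]].reset_index(drop=True)
print(df3)
```

With this layout, Val_1 is missing for added rows and Val_2 is missing for removed rows, so one table covers new IDs, new columns and changed values.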