Home > database >  Compare two Dataframe columns and retrive the mismatched columns in a new column
Compare two Dataframe columns and retrive the mismatched columns in a new column

Time:12-09

I am trying to compare two dataframes and create one column in the first dataframe says that mismatched columns. attaching the Dataframe images and expected output images below. Could I get some help here

enter image description here attaching the expected outputs here.Need one column says that names of all mismatched columns

CodePudding user response:

Assuming aligned indices, you can use:

df1['mismatch'] = df1.ne(df2).dot(df1.columns ',').str[:-1]

Alternatively:

out = (df1.ne(df2).reset_index().melt('index')
       .query('value').groupby('index')
       ['variable'].agg(','.join)
      )

df1['mismatch'] = out

Example output:

   A  B  C mismatch
0  1  2  3      A,B
1  4  5  6        A
2  7  8  9         

Used inputs:

# df1
   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9

# df2
   A  B  C
0  0  0  3
1  0  5  6
2  7  8  9

CodePudding user response:

Applying two lambda loops first to check if cell values equal or not second to find out the column names with true value and join them. Alternatively, instead of second lambda, You can even use .dot(). as @mozway has done.

Code:

df1.apply(lambda x: x!=df2.iloc[x.index][x.name]).apply(lambda x: ','.join(x.index[x]), axis=1)
  • Related