When comparing two Pandas dataframes how to ignore false result on NaN?

Time:03-31

I first load two CSV files into dataframes. As expected, there are some empty values in the data:

[Image: df1]

[Image: df2]

When I compare the dataframes with the pandas.DataFrame.eq method, I also get False wherever the values are NaN.

df_1.eq(df_2)

[Image: result]

How can I ignore the false result for NaN values?

CodePudding user response:

Replace the missing values in both dataframes with the same placeholder value, here the string 'same':

df_1.fillna('same').eq(df_2.fillna('same'))
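A minimal sketch of this approach on made-up data (the column names and values are hypothetical, not the asker's CSV contents). It assumes the placeholder never occurs as a real value in either frame. The cast to object before filling is an addition of this sketch, so that a string placeholder can go into numeric columns regardless of how the installed pandas version handles dtype upcasting:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the two CSV files.
df_1 = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", np.nan, "z"]})
df_2 = pd.DataFrame({"a": [1.0, np.nan, 4.0], "b": ["x", np.nan, "y"]})

# Plain eq: NaN never equals NaN, so row 1 comes back False.
plain = df_1.eq(df_2)

# Fill missing values on both sides with the same placeholder, then compare.
# astype(object) is an assumption of this sketch, not part of the answer.
left = df_1.astype(object).fillna("same")
right = df_2.astype(object).fillna("same")
filled = left.eq(right)

print(plain)
print(filled)
```

With the placeholder in place, row 1 compares as equal in both columns, while the genuine mismatch in row 2 still shows up as False.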

CodePudding user response:

Update:

If you compare float values, you shouldn't use eq but np.isclose, and set the tolerance as needed. For example, 6.999999999999999 and 7.0 are not strictly equal, but 6.9999999999999999 and 7.0 are equal (one more 9).

You can use:

np.isclose(df1, df2, equal_nan=True)
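A short sketch of how this might look end to end, assuming both frames are purely numeric and share the same shape and labels (the data below is made up for illustration). np.isclose returns a plain NumPy array, so the sketch wraps the result back into a dataframe. The rtol and atol values shown are NumPy's defaults, written out only to make the tolerance visible:

```python
import numpy as np
import pandas as pd

# Hypothetical numeric frames; np.isclose does not handle string columns.
df1 = pd.DataFrame({"a": [7.0, np.nan, 0.1 + 0.2], "b": [1.0, 2.0, np.nan]})
df2 = pd.DataFrame({"a": [6.9999999999, np.nan, 0.3], "b": [1.0, 2.5, 3.0]})

# equal_nan=True treats NaN in the same position on both sides as equal.
close = np.isclose(
    df1.to_numpy(), df2.to_numpy(), rtol=1e-05, atol=1e-08, equal_nan=True
)

# Restore the row and column labels lost in the conversion to ndarray.
result = pd.DataFrame(close, index=df1.index, columns=df1.columns)
print(result)
```

A NaN that appears on one side only (row 2 of column b here) still compares as False.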

Old answer

The code below works only on a pd.Series, not on a pd.DataFrame.

Use the fill_value parameter of the eq method:

df_1.eq(df_2, fill_value=np.inf)
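A small sketch of what fill_value does on two Series (made-up values). One caveat worth checking against your pandas version: the documentation for Series.eq states that a position missing in both Series stays missing after filling, so a pair of NaNs would still compare as False. The last line is a commonly used alternative, not taken from the answer above, that counts two NaNs as equal explicitly:

```python
import numpy as np
import pandas as pd

# Hypothetical Series: position 1 is missing on both sides,
# position 2 is missing on the left only.
s_1 = pd.Series([1.0, np.nan, np.nan])
s_2 = pd.Series([1.0, np.nan, 2.0])

# fill_value replaces a NaN only where the other side has a value,
# so the pair of NaNs at position 1 is left unfilled.
with_fill = s_1.eq(s_2, fill_value=np.inf)

# Alternative (an assumption of this sketch): treat NaN/NaN as a match.
nan_aware = s_1.eq(s_2) | (s_1.isna() & s_2.isna())

print(with_fill)
print(nan_aware)
```

The mask-based form also works on dataframes, since eq and isna are both defined there.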