The following code is supposed to create two identical, empty data frames, but the equality test returns False:
import pandas as pd
df1 = pd.DataFrame(columns=["A"])
df2 = pd.DataFrame({"A": []})
print(df1)
print(df2)
print(df1.equals(df2))
Here is the output produced by the code above:
Empty DataFrame
Columns: [A]
Index: []
Empty DataFrame
Columns: [A]
Index: []
False
Why does df1.equals(df2) return False?
CodePudding user response:
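equals() only tells you that the frames differ, not where. pandas.testing.assert_frame_equal raises an AssertionError that names the first difference it finds: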
import pandas as pd
from pandas.testing import assert_frame_equal
df1 = pd.DataFrame(columns=["A"])
df2 = pd.DataFrame({"A": []})
assert_frame_equal(df1, df2)
Output
DataFrame.index classes are not equivalent
[left]: Index([], dtype='object')
[right]: RangeIndex(start=0, stop=0, step=1)
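So the two frames have different index classes. Resetting the index on both removes that difference and exposes the next one:
assert_frame_equal(df1.reset_index(drop=True), df2.reset_index(drop=True))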
Output
Attribute "dtype" are different
[left]: object
[right]: float64
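The column dtypes differ as well: df1's column is object, while df2's is float64. Resetting both indexes and casting df2 to object makes the comparison return True: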
df1.reset_index(drop=True).equals(df2.astype(object).reset_index(drop=True))
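The same check can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: `normalize_empty` is a hypothetical name, and it assumes you are happy to throw away the existing index and treat every column as object. The loop first prints the index class and column dtype of each frame, which are the two properties the error messages above point at. The exact values printed may differ between pandas versions.

```python
import pandas as pd

df1 = pd.DataFrame(columns=["A"])
df2 = pd.DataFrame({"A": []})

# Inspect the two properties that the assertion messages flagged:
# the class of the index and the dtype of column "A".
# What gets printed here depends on the installed pandas version.
for name, df in (("df1", df1), ("df2", df2)):
    print(name, type(df.index).__name__, df["A"].dtype)


def normalize_empty(df):
    """Hypothetical helper: rebuild a default index and cast all columns to object."""
    return df.reset_index(drop=True).astype(object)


# After normalization both frames share the same index class and dtype.
same = normalize_empty(df1).equals(normalize_empty(df2))
print(same)
```

Applying the same normalization to both sides avoids having to know in advance which frame carries which index class or dtype.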