I have the following DataFrame and run df.nunique(axis=1).
I expect 4 on every row, since the values all have different data types.
But the output is quite different; presumably it treats 1, 1.0 and True as the same value.
Why does it behave this way?
import pandas as pd

df = pd.DataFrame({'a': [1, 2] * 3,
                   'b': [True, False] * 3,
                   'c': [1.0, 2.0] * 3,
                   'd': ['one', 'two'] * 3})
df
   a      b    c    d
0  1   True  1.0  one
1  2  False  2.0  two
2  1   True  1.0  one
3  2  False  2.0  two
4  1   True  1.0  one
5  2  False  2.0  two
df.dtypes
a      int64
b       bool
c    float64
d     object
dtype: object
df.nunique(axis = 1)
0    2
1    3
2    2
3    3
4    2
5    3
dtype: int64
CodePudding user response:
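Simply put, 1 and 1.0 represent the same number. Even though they have different types, in this case int and float, Python treats them as the same value. The same applies to True: bool is a subclass of int, so True == 1 and False == 0. That is why the rows holding 1, True, 1.0 and 'one' have 2 unique values, while the rows holding 2, False, 2.0 and 'two' have 3 (2 and 2.0 collapse into one, but False is equal to 0, not 2).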
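You can check this in plain Python, without pandas. This is only a minimal sketch; row0 and row1 are illustrative names that mirror the first two rows of the DataFrame in the question.

```python
# Values from the first two rows of the DataFrame, as plain Python objects.
row0 = [1, True, 1.0, 'one']
row1 = [2, False, 2.0, 'two']

# Equality ignores the int / bool / float distinction.
print(1 == 1.0 == True)   # True
print(2 == 2.0)           # True
print(2 == False)         # False

# Equal values also hash the same, so a set keeps only one of them.
print(hash(1) == hash(1.0) == hash(True))  # True
print(len(set(row0)))  # 2
print(len(set(row1)))  # 3
```

The two set sizes match the 2 and 3 that nunique reports for those rows.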
As for why 1.0 == 1: both represent the same number, and Python doesn't require two objects to have the same type for them to be considered equal. See also: Why in python 1.0==1 true
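If you do want values of different types to count as different, one option is to compare (type, value) pairs instead of bare values. This is only a sketch: count_typed_unique is a hypothetical helper written for this answer, not a pandas function, and it assumes every value is hashable.

```python
def count_typed_unique(values):
    """Count distinct values, treating equal values of different types as different."""
    # Pairing each value with its type keeps 1, 1.0 and True apart,
    # because (int, 1), (float, 1.0) and (bool, True) are not equal tuples.
    return len({(type(v), v) for v in values})

print(count_typed_unique([1, True, 1.0, 'one']))   # 4
print(count_typed_unique([2, False, 2.0, 'two']))  # 4
```

With the DataFrame from the question, df.apply(count_typed_unique, axis=1) should then give 4 for every row, assuming the row values come through as int, bool, float and str objects. I have not checked that on every pandas version.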