pandas.DataFrame.nunique on different data types

Time:08-01

I have the following DataFrame and run df.nunique(axis=1). I expected 4 on every row, since the four values in each row all have different data types.
But the output is quite different; presumably it treats 1, 1.0, and True as the same value. Why does it behave this way?

import pandas as pd

# Four columns with different dtypes: int64, bool, float64, object
df = pd.DataFrame({'a': [1, 2] * 3,
                   'b': [True, False] * 3,
                   'c': [1.0, 2.0] * 3,
                   'd': ['one', 'two'] * 3})

df

   a      b    c    d
0  1   True  1.0  one
1  2  False  2.0  two
2  1   True  1.0  one
3  2  False  2.0  two
4  1   True  1.0  one
5  2  False  2.0  two
                                    
df.dtypes

a      int64
b       bool
c    float64
d     object
dtype: object

df.nunique(axis = 1)

0    2
1    3
2    2
3    3
4    2
5    3
dtype: int64

CodePudding user response:

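Simple: nunique counts distinct values, not distinct types. 1, 1.0, and True all compare equal in Python (bool is a subclass of int, so True == 1), so they count as a single value even though their types differ. That is why the rows containing 1, True, 1.0, 'one' report 2 unique values. In the rows containing 2, False, 2.0, 'two', 2 and 2.0 are equal, but False equals 0 rather than 2, so those rows report 3.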
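
The counts in the question can be reproduced in plain Python. This is only a sketch: it uses a built-in set as a stand-in for value-based deduplication and does not show how pandas implements nunique internally. The names row0 and row1 are made up here to mirror rows 0 and 1 of the DataFrame.

```python
# Values from row 0 and row 1 of the DataFrame in the question
row0 = [1, True, 1.0, 'one']
row1 = [2, False, 2.0, 'two']

# Equality ignores the type: all three represent the number 1
print(1 == 1.0 == True)                    # True
# Equal numbers also hash the same, so a set keeps only one of them
print(hash(1) == hash(1.0) == hash(True))  # True
# bool is a subclass of int, which is why True behaves like 1
print(isinstance(True, int))               # True

unique0 = set(row0)   # 1, True, 1.0 collapse into one element, plus 'one'
unique1 = set(row1)   # 2 and 2.0 collapse; False (== 0) and 'two' stay separate

print(len(unique0))   # 2
print(len(unique1))   # 3
```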

As for why 1.0 == 1: Python doesn't require two objects to have the same type for them to be considered equal, only that they represent the same value. See also the related question "Why in python 1.0==1 true".
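
If the goal is the count the question expected (4 per row), one possible workaround is to treat the type as part of the value. The helper below is a hypothetical sketch, not a pandas feature; count_typed_unique and rows are names invented for this example.

```python
def count_typed_unique(values):
    """Count distinct (type, value) pairs, so 1, 1.0 and True stay separate."""
    return len({(type(v), v) for v in values})

# Plain-Python stand-ins for rows 0 and 1 of the DataFrame in the question
rows = [
    [1, True, 1.0, 'one'],
    [2, False, 2.0, 'two'],
]

typed_counts = [count_typed_unique(r) for r in rows]
print(typed_counts)   # [4, 4]
```

With the DataFrame from the question, the same helper could presumably be applied row-wise with df.apply(count_typed_unique, axis=1), since each row then holds one value per column type; treat that as an untested assumption here.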
