Python provides different dtypes for same column

Time:02-24

I will try to provide a minimal example soon, but in the meantime: how is it possible that the column "Home Points" is of type object and int64 simultaneously? Any hints? Is this a pandas bug?

>>> print(df[["Home Team", "Away Team", "Home Points", "Away Points"]].dtypes)
>>> print()
>>> print(df["Home Points"].describe())
>>> print()
>>> df['Home Points'].unique()

Home Team      object
Away Team      object
Home Points    object
Away Points    object
dtype: object

count     8754
unique       3
top          3
freq      3801
Name: Home Points, dtype: int64

array([3, 1, 0], dtype=object)

CodePudding user response:

It is not. Your first output, from .dtypes, describes the columns within the dataframe, whereas in the output of

df['Home Points'].describe()

you are looking at the return value of that method, which per its documentation:

Returns: Series or DataFrame. Summary statistics of the Series or Dataframe provided.

The dtype line at the bottom of that output belongs to the returned summary Series, not to the source column. Its values (count, unique, top, freq) all happen to be integers here, which is why it shows up as int64. It is therefore a completely different object for Python; it just happens to carry the same name as the column in the original dataframe.
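To see the distinction directly, here is a minimal sketch. The small dataframe is a hypothetical stand-in for the one in the question (integers stored in an object column); only the column name is taken from the original post.

```python
import pandas as pd

# Hypothetical data: integer values deliberately stored with dtype=object,
# mimicking the "Home Points" column from the question.
df = pd.DataFrame({"Home Points": pd.Series([3, 1, 0, 3, 3], dtype=object)})

column = df["Home Points"]
summary = column.describe()  # a new Series with count/unique/top/freq

print(column.dtype)          # dtype of the source column
print(summary.dtype)         # dtype of the summary Series, inferred from its own values
print(summary.name)          # the summary reuses the column's name
print(list(summary.index))   # the labels of the summary statistics
```

The two dtype lines refer to two different Series, so they are free to differ. Checking `df["Home Points"].dtype` (rather than the footer of `describe()`) shows the dtype of the column itself.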
