I will try to provide a minimal example soon, but in the meantime: how is it possible that the column "Home Points" is of type object and int64 simultaneously? Any hints? Is this a pandas bug?
>>> print(df[["Home Team", "Away Team", "Home Points", "Away Points"]].dtypes)
>>> print()
>>> print(df["Home Points"].describe())
>>> print()
>>> df['Home Points'].unique()
Home Team object
Away Team object
Home Points object
Away Points object
dtype: object
count 8754
unique 3
top 3
freq 3801
Name: Home Points, dtype: int64
array([3, 1, 0], dtype=object)
CodePudding user response:
It is not. In your first output, .dtypes describes the columns as they are stored in the dataframe, whereas in the output of
df['Home Points'].describe()
you are looking at the return value of that method, which per its documentation is:
Returns: Series or DataFrame. Summary statistics of the Series or Dataframe provided.
The "dtype: int64" at the bottom of that printout belongs to this summary Series, not to the source column. Its values here (count, unique, top, freq) all happen to be integers, so it is displayed as int64. It is a completely different object for Python; it just happens that the summary Series has the same name as the column in the original dataframe.
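To see this for yourself, here is a minimal sketch. The small dataframe is a hypothetical stand-in for your data (integers stored in an object column), not your actual file. It shows that describe() hands back a new Series whose index holds the statistic labels, and that this Series carries its own dtype, separate from the column's:

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: integers stored in an object column.
df = pd.DataFrame({"Home Points": pd.Series([3, 1, 0, 3, 3, 1], dtype=object)})

column = df["Home Points"]
summary = column.describe()  # a new Series built from the column

print(column.dtype)          # dtype of the source column
print(list(summary.index))   # labels of the summary statistics
print(summary.dtype)         # dtype of the summary Series, not of the column
print(summary is column)     # the two are distinct objects
```

The dtype printed for the summary depends on what pandas infers from the summary values, so it can differ from the column's dtype without any bug being involved.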