I am trying to convert a pandas DataFrame to a Series, based on the accepted answer to "Convert dataframe to series for multiple column".
However, I am getting NaN in my integer column 'y'.
Here is my code:
import pandas as pd

data = [['2021-10-14 18:12:00.000', '22811316'], ['2021-10-14 18:42:00.000', '22700704']]
df = pd.DataFrame(data, columns=['ds', 'y'])
series = pd.Series(df.y, index=df.ds)
Printing series gives me:
ds
2021-10-14 18:12:00.000 NaN
2021-10-14 18:42:00.000 NaN
Name: y, dtype: object
What am I missing?
CodePudding user response:
I found the answer in "pandas.Series() Creation using DataFrame Columns returns NaN Data entries".
The trick was to use:
series = pd.Series(df.y.values, index=df.ds)
CodePudding user response:
If you just take the series df.y, you get a series with the default integer index 0, 1, ...
print(df.y)
0 22811316
1 22700704
Name: y, dtype: object
These index labels do not match the values of the column ds that you want to use as the index.
So, when you create the new series with index=..., pandas looks up each new label in the existing index, finds no match, and fills every entry with NaN.
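As a minimal sketch of that lookup, the snippet below rebuilds the DataFrame from the question and compares the constructor call with an explicit reindex. The variable names reindexed and explicit are only illustrative, and the equivalence with reindex reflects my understanding of how the Series constructor treats Series input.

```python
import pandas as pd

df = pd.DataFrame(
    [['2021-10-14 18:12:00.000', '22811316'],
     ['2021-10-14 18:42:00.000', '22700704']],
    columns=['ds', 'y'],
)

# df.y is already a Series labelled 0 and 1. Passing it together with
# index=... asks pandas to look up the new labels in that old index.
reindexed = pd.Series(df.y, index=df.ds)

# The same lookup written explicitly (illustrative name).
explicit = df.y.reindex(df.ds)

# None of the timestamp strings exist among the labels 0 and 1,
# so every entry is missing in both results.
print(reindexed.isna().all())
print(explicit.isna().all())
```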
To put just the values of the y column into the new series, pass only its values using to_numpy(). A plain array carries no index, so there is nothing to align and the values are assigned by position:
series = pd.Series(df.y.to_numpy(), index=df.ds)
print(series)
ds
2021-10-14 18:12:00.000 22811316
2021-10-14 18:42:00.000 22700704
dtype: object
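An alternative that avoids the alignment entirely is to move ds into the index with set_index and then select y. The sketch below also converts the types, which is an assumption about what is wanted: in the question both columns hold strings (hence dtype: object), so the conversion only matters if integer values and a datetime index are needed.

```python
import pandas as pd

df = pd.DataFrame(
    [['2021-10-14 18:12:00.000', '22811316'],
     ['2021-10-14 18:42:00.000', '22700704']],
    columns=['ds', 'y'],
)

# Move 'ds' into the index, then select 'y'; no label alignment happens.
series = df.set_index('ds')['y']

# Optional, and an assumption about intent: turn the string values into
# integers and the string labels into timestamps.
series = series.astype('int64')
series.index = pd.to_datetime(series.index)

print(series)
```

Unlike the to_numpy() version, the result keeps the name y, because it is selected from the DataFrame rather than built from a bare array.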