I have this column, with dtype datetime64[ns]:
df.date_month
Output:
0 2018-09-01
1 2018-09-01
2 2018-09-01
3 2018-09-01
4 2018-09-01
...
Name: date_month, Length: 4839993, dtype: datetime64[ns]
If I run a for loop and add pd.DateOffset, the code runs.
for i in df.date_month[0:10]:
    print(i + pd.DateOffset(months=12))
Output:
2019-09-01 00:00:00
2019-09-01 00:00:00
2019-09-01 00:00:00
2019-09-01 00:00:00
However, if I use unique(), the code breaks.
for i in df.date_month.unique():
    print(i + pd.DateOffset(months=12))
Output:
UFuncTypeError Traceback (most recent call last)
<command-3708796390803054> in <module>
1 for i in df.date_month.unique():
----> 2 print(i + pd.DateOffset(months=12))
UFuncTypeError: ufunc 'add' cannot use operands with types dtype('<M8[ns]') and dtype('O')
Can someone help me understand why this happens? Does unique() transform the data in some way?
df.date_month.unique()
Output:
array(['2018-09-01T00:00:00.000000000', '2018-04-01T00:00:00.000000000',
'2018-12-01T00:00:00.000000000', '2018-11-01T00:00:00.000000000',
'2018-07-01T00:00:00.000000000', '2018-05-01T00:00:00.000000000',
'2018-06-01T00:00:00.000000000', '2018-10-01T00:00:00.000000000',
'2018-08-01T00:00:00.000000000'], dtype='datetime64[ns]')
Answer:
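That's correct: unique() returns a NumPy array, not a Series. Iterating over a Series yields pandas Timestamp objects, which support adding a pd.DateOffset. Iterating over the array returned by unique() yields numpy.datetime64 values, which do not, and that is what raises the UFuncTypeError. See --> https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.unique.html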
You more than likely want to use drop_duplicates()
--> https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop_duplicates.html
i.e.
for i in df.drop_duplicates(subset=['date_month']).date_month:
    print(i + pd.DateOffset(months=12))
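If you don't actually need the loop, you can probably add the offset to all the unique values at once. A rough sketch, again using a small made-up Series `s` in place of df.date_month:

```python
import pandas as pd

# Hypothetical stand-in for df.date_month (not the original data).
s = pd.Series(
    pd.to_datetime(["2018-09-01", "2018-09-01", "2018-04-01"]),
    name="date_month",
)

# Series.drop_duplicates() returns a Series, so the offset is applied
# to every remaining value in one step.
shifted = s.drop_duplicates() + pd.DateOffset(months=12)
print(shifted)

# Alternatively, wrap the unique() result in a DatetimeIndex first.
shifted_idx = pd.DatetimeIndex(s.unique()) + pd.DateOffset(months=12)
print(shifted_idx)
```

Both variants keep the values as pandas datetimes, which is why the addition works without converting each element by hand.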