The code:
import pandas as pd
import datetime as dt
d1 = pd.to_datetime(df.dates, errors='coerce')
d1 = pd.to_timedelta(d1).dt.days
I have a column with lots of dates that I want to convert to a number of days, but I keep getting errors. I'm new to pandas, so sorry if the question is silly. A sample of the column:
2014-07-29
2008-11-14
2010-04-20
2011-08-31
2002-07-29
2013-10-29
Why am I getting this error, and how do I fix it?
TypeError Traceback (most recent call last)
<ipython-input-119-298bf748984e> in <module>()
1 d1 = pd.to_datetime(df.dates, errors='coerce')
----> 2 d1 = pd.to_timedelta(d1).dt.days
2 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/tools/timedeltas.py in to_timedelta(arg, unit, errors)
122 return arg
123 elif isinstance(arg, ABCSeries):
--> 124 values = _convert_listlike(arg._values, unit=unit, errors=errors)
125 return arg._constructor(values, index=arg.index, name=arg.name)
126 elif isinstance(arg, ABCIndex):
/usr/local/lib/python3.7/dist-packages/pandas/core/tools/timedeltas.py in _convert_listlike(arg, unit, errors, name)
171
172 try:
--> 173 td64arr = sequence_to_td64ns(arg, unit=unit, errors=errors, copy=False)[0]
174 except ValueError:
175 if errors == "ignore":
/usr/local/lib/python3.7/dist-packages/pandas/core/arrays/timedeltas.py in sequence_to_td64ns(data, copy, unit, errors)
1018 else:
1019 # This includes datetime64-dtype, see GH#23539, GH#29794
-> 1020 raise TypeError(f"dtype {data.dtype} cannot be converted to timedelta64[ns]")
1021
1022 data = np.array(data, copy=copy)
TypeError: dtype datetime64[ns] cannot be converted to timedelta64[ns]
Thank you
CodePudding user response:
To get the number of days from each date to today, subtract the dates from today's date:
today = pd.Timestamp.today().normalize()
d1 = (today - pd.to_datetime(df['dates'], errors='coerce')).dt.days
print(d1)
# Output
0 2910
1 4993
2 4471
3 3973
4 7293
5 3183
Name: dates, dtype: int64
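Since pd.Timestamp.today() changes every day, the numbers above will differ when you run the code. Here is a self-contained sketch of the same subtraction with a fixed reference date. The date 2022-07-17 and the names reference, parsed and with_bad are my own assumptions for illustration; the date was picked so that the counts line up with the output shown above.

```python
import pandas as pd

# Sample dates from the question, in a DataFrame named df as in the original code.
df = pd.DataFrame({"dates": ["2014-07-29", "2008-11-14", "2010-04-20",
                             "2011-08-31", "2002-07-29", "2013-10-29"]})

# Assumption: a fixed reference date instead of pd.Timestamp.today(),
# so the result is the same on every run.
reference = pd.Timestamp("2022-07-17")

# Parse the strings, then subtract: datetime - datetime gives a timedelta,
# and .dt.days extracts the whole-day count.
parsed = pd.to_datetime(df["dates"], errors="coerce")
d1 = (reference - parsed).dt.days
print(d1)

# With errors="coerce", an unparseable entry becomes NaT,
# and its day count comes back as a missing value.
with_bad = pd.to_datetime(pd.Series(["2014-07-29", "not a date"]), errors="coerce")
days_with_bad = (reference - with_bad).dt.days
print(days_with_bad)
```

If the column may contain bad values, it is probably worth checking for missing results (for example with d1.isna().sum()) before doing arithmetic on the day counts.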
CodePudding user response:
@Corralien's solution is great! I just wanted to add some context for learning purposes:
Notice that it raises a TypeError because pd.to_timedelta is meant to be applied to durations (a span of time, a delta), not to points in time. In pandas lingo, your column holds datetime64 values, i.e. dates, and a single date has no length on its own. So if you pass regular dates to it, the error you got is expected.
From the docs, you can see that the TypeError is mentioned right at the bottom.
Try this out to see the types you get:
d1 = pd.to_datetime(df['dates'], errors='coerce')
print(d1.dtype)
print(type(d1))
print(type(d1.iloc[0]))
Then this should be the result:
datetime64[ns]
<class 'pandas.core.series.Series'>
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
Thus:
TypeError: dtype datetime64[ns] cannot be converted to timedelta64[ns]
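To make the distinction concrete, here is a small sketch of both sides: subtracting a reference point from the dates produces a timedelta, while pd.to_timedelta is for values that already describe durations. The epoch reference 1970-01-01 and the example strings "3 days" and "36 hours" are assumptions chosen for illustration, not something from the question.

```python
import pandas as pd

# Two of the dates from the question, parsed into datetime values (points in time).
dates = pd.to_datetime(pd.Series(["2014-07-29", "2008-11-14"]))
print(dates.dtype)

# A duration only exists between two points in time, so subtract a reference.
# Assumption: the Unix epoch is used here purely as an example reference.
delta = dates - pd.Timestamp("1970-01-01")
print(delta.dtype)

# Now .dt.days works, because delta holds timedelta values.
days = delta.dt.days
print(days)

# pd.to_timedelta is for inputs that already describe a duration.
durations = pd.to_timedelta(pd.Series(["3 days", "36 hours"]))
print(durations.dt.days)
```

Note that .dt.days only returns the whole-day component, so a duration of 36 hours reports 1 day.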