What is wrong with this?
pd.to_datetime('2022-01-01',unit='D')
If I do it without the unit
pd.to_datetime('2022-01-01')
no error is raised. However, instead of the standard unit ns, I would rather have D.
CodePudding user response:
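The issue is that the unit parameter of pandas.to_datetime() describes the unit of numeric input (for example, a number of days since the origin), not the resolution of the output. It cannot be combined with a date string such as '2022-01-01'.
To parse a date string, drop unit. If you want to state the layout of the string explicitly, use the format parameter. Note that format also describes the input, not the output.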
dt = pd.to_datetime('2022-01-01', format='%Y-%m-%d')
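If the goal is a value with day granularity rather than a particular parsing behaviour, one option is to convert after parsing. The following is a minimal sketch of three possible conversions; the variable names are illustrative, and which one fits depends on what the rest of your code expects.

```python
import pandas as pd

# format tells pandas how to read the input string; it does not set the output resolution.
ts = pd.to_datetime('2022-01-01', format='%Y-%m-%d')

# Three ways to get a day-level value out of the parsed Timestamp:
as_date = ts.date()                                # plain datetime.date
as_period = ts.to_period('D')                      # pandas Period with daily frequency
as_np_day = ts.to_numpy().astype('datetime64[D]')  # NumPy datetime64 with day unit

print(as_date, as_period, as_np_day)
```

The Period and NumPy variants keep the "day" unit in their type, while date() simply drops the time part.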
CodePudding user response:
There is a quite clear description and examples on the
So it seems legitimate, doesn't it?
Let's try it with unit='D', which stands for days:
pd.to_datetime([1, 2, 3], unit='D',
origin=pd.Timestamp('1960-01-01'))
Output:
DatetimeIndex(['1960-01-02', '1960-01-03', '1960-01-04'], dtype='datetime64[ns]', freq=None)
What happened here? origin is the base date, and each number in the list is an offset from it, counted in the given unit.
With unit='D' the offsets are days, no problem. Let's see how it behaves with a different list and unit='s', which stands for seconds:
pd.to_datetime([0, 30, 64], unit='s',
origin=pd.Timestamp('1960-01-01'))
Output:
DatetimeIndex(['1960-01-01 00:00:00', '1960-01-01 00:00:30',
'1960-01-01 00:01:04'],
dtype='datetime64[ns]', freq=None)
That was expected. It is the same thing: 0 leaves the base value unchanged, 30 adds 30 seconds, and 64 adds 64 seconds, which gives 00:01:04.
To sum it up:
You are misusing the unit= key. It tells pandas how to interpret a list of numbers as offsets that are added to the base datetime. Your date belongs in the origin= key, as origin='2022-01-01'.
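Applied to the date from the question, that could look like the sketch below. The offsets are made-up illustrative values, and the second computation is only there to show that unit plus origin amounts to adding timedeltas to the base date.

```python
import pandas as pd

offsets = [0, 1, 31]  # illustrative day offsets, not taken from the question

# Interpret each number as "days after origin".
from_origin = pd.to_datetime(offsets, unit='D', origin='2022-01-01')

# The same dates, written as explicit timedelta arithmetic.
by_addition = pd.Timestamp('2022-01-01') + pd.to_timedelta(offsets, unit='D')

print(from_origin)
print(bool((from_origin == by_addition).all()))
```

The resulting dates are 2022-01-01, 2022-01-02 and 2022-02-01.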
If you don't want this functionality and only want the day of the month from this value, then look at the other answer. Basically:
pd.to_datetime('2022-01-01', format='%Y-%m-%d').day
Output:
1
The result 1 is the first day of January 2022.
CodePudding user response:
The unit parameter of pd.to_datetime() does expect a string, and 'D' is a valid one, so the type of the argument is not the problem. The problem is that unit only applies to numeric input: it tells pandas how to interpret a number as an offset from the origin (the Unix epoch, 1970-01-01, by default). '2022-01-01' is a date string rather than a number, so it cannot be interpreted as a count of days.
To fix this, drop the unit parameter when you parse a date string:
pd.to_datetime('2022-01-01')
Or, if what you have is a number of days since the epoch, pass that number together with unit='D':
pd.to_datetime(18993, unit='D')
Both calls return the timestamp 2022-01-01. It is worth noting that the default value of unit is 'ns', which stands for nanoseconds, so if you pass a number without specifying a unit, it is interpreted as nanoseconds since the origin.
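To see how a date and a day count relate, here is a small sketch that derives the number of days since the default origin and converts it back. The variable names are illustrative only.

```python
import pandas as pd

target = pd.Timestamp('2022-01-01')
epoch = pd.Timestamp('1970-01-01')  # default origin ('unix')

# Whole days between the epoch and the target date.
days_since_epoch = (target - epoch).days

# Numeric input plus unit='D' maps the count back to the same date.
round_trip = pd.to_datetime(days_since_epoch, unit='D')

print(days_since_epoch, round_trip)
```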