Where can I find a complete list of acceptable pandas.Timestamp formats?

Time:07-03

I find that pandas.Timestamp is an extremely powerful and flexible parsing tool that accepts a wide range of timestamp/datetime formats. E.g.

In [38]: pd.Timestamp('2020')
Out[38]: Timestamp('2020-01-01 00:00:00')

In [39]: pd.Timestamp('2020-02')
Out[39]: Timestamp('2020-02-01 00:00:00')

In [40]: pd.Timestamp('2020Q1')
Out[40]: Timestamp('2020-01-01 00:00:00')

But it doesn't always do the "magic" I was expecting, e.g. the following are rejected:

In [41]: pd.Timestamp('202003')  # expecting 2020-03-01
ValueError: could not convert string to Timestamp

In [42]: pd.Timestamp('2020H2')  # expecting 2020-07-01, i.e. 2020 second half (start)
ValueError: could not convert string to Timestamp

I tried to find a complete list of supported formats, but the documentation seems to be missing it (or I'm missing something). Can anyone help? Thanks!

CodePudding user response:

pd.Timestamp is the pandas equivalent of Python's datetime.datetime and is interchangeable with it in most cases. The datetime module accepts ISO 8601 date formats (for example, via datetime.fromisoformat).

In Python, an ISO 8601 date is represented in the YYYY-MM-DDTHH:MM:SS.mmmmmm format. For example, a time on May 18, 2022 is represented as 2022-05-18T11:40:22.519222.

  • YYYY: Year in four-digit format
  • MM: Month, from 01 to 12
  • DD: Day, from 01 to 31
  • T: The separator character between the date and time fields. In datetime.isoformat() it is the optional sep parameter, which defaults to "T".
  • HH: Hours
  • MM: Minutes
  • SS: Seconds
  • mmmmmm: Microseconds
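
As a quick check of the layout described above, here is a minimal sketch that parses the same ISO 8601 string with both pd.Timestamp and datetime.fromisoformat and compares the results. The variable names are only illustrative.

```python
from datetime import datetime

import pandas as pd

# The ISO 8601 example string from the text above.
iso = "2022-05-18T11:40:22.519222"

ts = pd.Timestamp(iso)            # pandas parser
dt = datetime.fromisoformat(iso)  # standard-library parser

# Both parsers should agree on every field, down to the microsecond.
print(ts == dt)
print(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second, ts.microsecond)

# Formatting the Timestamp back out gives the original string.
print(ts.isoformat())
```

This only shows that a full ISO 8601 string is accepted; it does not enumerate every other format that pd.Timestamp happens to parse.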

Source

Directly from the pandas documentation for pandas.Timestamp:

There are essentially three calling conventions for the constructor. The primary form accepts four parameters. They can be passed by position or keyword.

The other two forms mimic the parameters from datetime.datetime. They can be passed by either position or keyword, but not both mixed together.

Examples

Using the primary calling convention: This converts a datetime-like string

>>> pd.Timestamp('2017-01-01T12')
Timestamp('2017-01-01 12:00:00')

This converts a float representing a Unix epoch in units of seconds

>>> pd.Timestamp(1513393355.5, unit='s')
Timestamp('2017-12-16 03:02:35.500000')

This converts an int representing a Unix-epoch in units of seconds and for a particular timezone

>>> pd.Timestamp(1513393355, unit='s', tz='US/Pacific')
Timestamp('2017-12-15 19:02:35-0800', tz='US/Pacific')

Using the other two forms that mimic the API for datetime.datetime:

>>> pd.Timestamp(2017, 1, 1, 12)
Timestamp('2017-01-01 12:00:00')

>>> pd.Timestamp(year=2017, month=1, day=1, hour=12)
Timestamp('2017-01-01 12:00:00')

CodePudding user response:

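If you are working with parts of a year, use quarterly periods

import pandas as pd
df=pd.DataFrame({'date':['2020-Q1']})
# Parse the strings as quarterly periods, then convert each to its start timestamp
pd.PeriodIndex(df['date'], freq='Q').to_timestamp()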

and for compact date strings such as '202003', pass an explicit format

import pandas as pd
pd.to_datetime('202003', format='%Y%m')
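
To cover the two inputs from the question that pd.Timestamp rejects, one option is a small wrapper that handles those patterns explicitly and falls back to pd.Timestamp for everything else. This is only a sketch: parse_loose is a hypothetical helper, not part of pandas, and the 'YYYYHn' half-year notation is an assumption taken from the question.

```python
import re

import pandas as pd


def parse_loose(text):
    """Hypothetical helper: parse 'YYYYHn' and 'YYYYMM', else defer to pd.Timestamp."""
    # Half-year notation such as '2020H2' (assumed to mean the start of that half).
    half = re.fullmatch(r"(\d{4})H([12])", text)
    if half:
        year = int(half.group(1))
        month = 1 if half.group(2) == "1" else 7
        return pd.Timestamp(year=year, month=month, day=1)

    # Compact year-month such as '202003', using the explicit format shown above.
    if re.fullmatch(r"\d{6}", text):
        return pd.to_datetime(text, format="%Y%m")

    # Anything else goes to the regular pandas parser.
    return pd.Timestamp(text)


print(parse_loose("2020H2"))
print(parse_loose("202003"))
print(parse_loose("2020Q1"))
```

Keeping the special cases explicit avoids guessing: a six-digit string is always read as year plus month here, which may not suit data where it could also mean something like YYMMDD.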