How to manually create a Python (pandas) dataframe with daily dates in the correct format

Time:04-23

I would like to manually create a dataframe in Python with daily dates (in the 'date' column), as in the code below. However, the code does not produce the correct format for the daily dates; the dates are replaced by unrelated numbers (the desired format is shown below). Could you please advise how I can correct the code so that the 'date' column is entered in the desired format? Thanks in advance!

------------------------------------------------------
desired format for date column

2021-03-22    3
2021-04-07    3
2021-04-18    3
2021-05-12    0

------------------------------------------------------
df1 = pd.DataFrame({"date": [2021-3-22, 2021-4-7, 2021-4-18, 2021-5-12],
"x": [3, 3, 3, 0 ]})
df1
    date    x
0   1996    3
1   2010    3
2   1999    3
3   2004    0



CodePudding user response:

You want to create the column as datetimes. Write the dates as strings and pass them to pandas.to_datetime (see the pandas.to_datetime documentation for more information):

import pandas as pd

# Quote the dates so they are strings, then let pandas parse them as datetimes
df1 = pd.DataFrame({"date": pd.to_datetime(["2021-3-22", "2021-4-7", "2021-4-18", "2021-5-12"]),
                    "x": [3, 3, 3, 0]})
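
To confirm the conversion and reproduce the layout in the question (dates on the left, values on the right), one option is to move the date column into the index. This is only a minimal sketch; the name `s` is introduced here for illustration and is not part of the original code.

```python
import pandas as pd

df1 = pd.DataFrame({"date": pd.to_datetime(["2021-3-22", "2021-4-7", "2021-4-18", "2021-5-12"]),
                    "x": [3, 3, 3, 0]})

# The column now holds datetimes rather than integers or strings.
print(pd.api.types.is_datetime64_any_dtype(df1["date"]))

# Use the dates as the index so each date is printed next to its x value.
s = df1.set_index("date")["x"]  # "s" is an illustrative name
print(s)
```

Whether the dates belong in the index or in a regular column depends on how the data will be used; both keep the datetime type.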

CodePudding user response:

Python interprets the unquoted sequence 2021-3-22 as arithmetic: 2021 minus 3 minus 22, which evaluates to 1996. That is why the date column contains numbers such as 1996 and 2010 instead of dates.
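
A quick check in plain Python illustrates this. The variable names `value` and `text` are used here only for illustration.

```python
# Unquoted, the hyphens are subtraction operators, so the "date" is an int.
value = 2021-3-22
print(value)        # 1996
print(type(value))  # <class 'int'>

# Quoted, the same characters are kept verbatim as a string.
text = "2021-3-22"
print(text)         # 2021-3-22
```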

If you want each item to be stored as a string that resembles a date, you need to write it as a string literal (str) by enclosing it in quotes, as shown below.

import pandas as pd

df1 = pd.DataFrame({"date": ['2021-3-22', '2021-4-7', '2021-4-18', '2021-5-12'],
"x": [3, 3, 3, 0 ]})

The output for the date column, shown here, indicates that the column contains elements of the object datatype, which is how pandas stores str values. Notice that the strings are kept exactly as typed (2021-3-22 instead of 2021-03-22).

0    2021-3-22
1     2021-4-7
2    2021-4-18
3    2021-5-12
Name: date, dtype: object

If, however, you actually want them stored as datetime objects so that you can do datetime manipulations on them (e.g. determine the number of days between two dates, or filter by a specific month or year), then you need to convert the values to datetime objects.

This technique will do that:

df1['date'] = pd.to_datetime(df1['date'])

The result of this conversion is a column of pandas datetime values, which support nanosecond precision (I differentiate these from Python datetime objects, which are limited to microsecond precision).

0   2021-03-22
1   2021-04-07
2   2021-04-18
3   2021-05-12
Name: date, dtype: datetime64[ns]

Notice the displayed results are now formatted just as you would expect of datetimes (2021-03-22 instead of 2021-3-22).
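
The datetime manipulations mentioned above, such as counting the days between two dates or filtering by month, can be sketched as follows. The names `span` and `april` are illustrative and not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({"date": ['2021-3-22', '2021-4-7', '2021-4-18', '2021-5-12'],
                    "x": [3, 3, 3, 0]})
df1['date'] = pd.to_datetime(df1['date'])

# Number of days between the first and the last date (the difference is a Timedelta).
span = df1['date'].iloc[-1] - df1['date'].iloc[0]
print(span.days)  # 51

# Keep only the rows whose date falls in April.
april = df1[df1['date'].dt.month == 4]
print(april)

# If zero-padded strings are needed for display, format the datetimes explicitly.
print(df1['date'].dt.strftime('%Y-%m-%d').tolist())
```

None of these operations work on the string version of the column, which is the practical reason to convert before doing any date arithmetic.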
