How to remove hours, minutes, seconds and UTC offset from pandas date column?

Time:09-23

How do I remove T00:00:00+05:30 after the year, month and day values in pandas? I tried converting the column to datetime, but it still shows the same result. I'm using pandas in Streamlit. I tried the code below:

df['Date'] = pd.to_datetime(df['Date'])

The output is still the same, as shown below:

Date
2019-07-01T00:00:00+05:30
2019-07-01T00:00:00+05:30
2019-07-02T00:00:00+05:30
2019-07-02T00:00:00+05:30
2019-07-02T00:00:00+05:30
2019-07-03T00:00:00+05:30
2019-07-03T00:00:00+05:30
2019-07-04T00:00:00+05:30
2019-07-04T00:00:00+05:30
2019-07-05T00:00:00+05:30

Can anyone help me remove T00:00:00+05:30 from the rows above?

CodePudding user response:

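Don't bother with applying Python date conversions or string manipulation. The former leaves you with an object-dtype column and the latter is slow. Just truncate to day frequency using the library function. Use floor rather than round here: round goes to the nearest midnight, so any time after noon would land on the next day.

>>> pd.Series([pd.Timestamp('2000-01-05 12:01')]).dt.floor('D')
0   2000-01-05
dtype: datetime64[ns]

If you have a timezone-aware timestamp, drop the time zone with tz_localize(None), which keeps the local wall-clock time, then floor. (tz_convert(None) would convert to UTC first, which can shift the date.)

>>> pd.Series([pd.Timestamp('2019-07-01T00:00:00+05:30')]).dt.tz_localize(None) \
        .dt.floor('D')
0   2019-07-01
dtype: datetime64[ns]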
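
As a rough illustration of how the rounding function and the time zone handling affect the resulting date, here is a small sketch. The sample values are only illustrative, and the comments show what I'd expect each line to print.

```python
import pandas as pd

# Illustrative value: one naive timestamp just after noon.
afternoon = pd.Series([pd.Timestamp("2000-01-05 12:01")])
print(afternoon.dt.round("D").iloc[0])     # 2000-01-06 00:00:00 (nearest midnight)
print(afternoon.dt.floor("D").iloc[0])     # 2000-01-05 00:00:00 (same day)
print(afternoon.dt.normalize().iloc[0])    # 2000-01-05 00:00:00 (same day)

# Illustrative value: local midnight at UTC+05:30, as in the question.
aware = pd.Series([pd.Timestamp("2019-07-01T00:00:00+05:30")])
print(aware.dt.tz_localize(None).iloc[0])  # 2019-07-01 00:00:00 (local wall time kept)
print(aware.dt.tz_convert(None).iloc[0])   # 2019-06-30 18:30:00 (converted to UTC first)
```

If the goal is the calendar date as seen in the local time zone, floor (or normalize) combined with tz_localize(None) is probably the safer pairing.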

CodePudding user response:

If I understand correctly, you want to keep only the date part.

Convert date strings to datetime

df = pd.DataFrame(
    columns=['date'],
    data=["2019-07-01T02:00:00+05:30", "2019-07-02T01:00:00+05:30"]
)
                        date
0  2019-07-01T02:00:00+05:30
1  2019-07-02T01:00:00+05:30

df['date'] = pd.to_datetime(df['date'])

                       date
0 2019-07-01 02:00:00+05:30
1 2019-07-02 01:00:00+05:30

Remove the timezone

df['date'] = df['date'].dt.tz_localize(None)

                 date
0 2019-07-01 02:00:00
1 2019-07-02 01:00:00

Keep the date only

df['date'] = df['date'].dt.date

0    2019-07-01
1    2019-07-02
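
The three steps above could be wrapped into one helper, sketched below. The function name to_local_date and the as_string flag are made up for this example, and it assumes every row carries the same UTC offset so that pd.to_datetime returns a timezone-aware datetime column.

```python
import pandas as pd

def to_local_date(column: pd.Series, as_string: bool = False) -> pd.Series:
    """Hypothetical helper: parse ISO 8601 strings and keep only the local date."""
    parsed = pd.to_datetime(column)           # timezone-aware datetimes
    naive = parsed.dt.tz_localize(None)       # drop the offset, keep wall-clock time
    if as_string:
        return naive.dt.strftime("%Y-%m-%d")  # plain 'YYYY-MM-DD' text, e.g. for display
    return naive.dt.normalize()               # datetime values set to midnight

df = pd.DataFrame(
    {"date": ["2019-07-01T02:00:00+05:30", "2019-07-02T01:00:00+05:30"]}
)
df["day"] = to_local_date(df["date"])
df["day_text"] = to_local_date(df["date"], as_string=True)
print(df[["day", "day_text"]])
```

Unlike .dt.date, the default branch keeps a datetime dtype, so date arithmetic and resampling still work. The string variant may be handy when the table is only being displayed, as in the Streamlit case from the question.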

CodePudding user response:

pandas has no dedicated date-only dtype, so a column of datetime.date objects is stored with object dtype. If you want date objects rather than strings, you can build them with .apply:

import pandas as pd
import datetime

df = pd.DataFrame(
    {"date": [
        "2019-07-01T00:00:00+05:30",
        "2019-07-01T00:00:00+05:30",
        "2019-07-02T00:00:00+05:30",
        "2019-07-02T00:00:00+05:30",
        "2019-07-02T00:00:00+05:30",
        "2019-07-03T00:00:00+05:30",
        "2019-07-03T00:00:00+05:30",
        "2019-07-04T00:00:00+05:30",
        "2019-07-04T00:00:00+05:30",
        "2019-07-05T00:00:00+05:30"]})

df["date"] = df["date"].apply(lambda x: datetime.datetime.fromisoformat(x).date())
print(df)
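
To check what the .apply approach actually produces, a quick sketch like the following may help. It uses a shortened, illustrative copy of the data and compares the result with the vectorized .dt.date accessor. It assumes Python 3.7 or later, where datetime.fromisoformat is available.

```python
import datetime
import pandas as pd

# Shortened, illustrative copy of the question's data.
raw = pd.Series(["2019-07-01T00:00:00+05:30", "2019-07-02T00:00:00+05:30"])

via_apply = raw.apply(lambda x: datetime.datetime.fromisoformat(x).date())
via_dt = pd.to_datetime(raw).dt.date  # vectorized parse, then extract local dates

print(via_apply.dtype)                    # object
print(type(via_apply.iloc[0]))            # <class 'datetime.date'>
print(list(via_apply) == list(via_dt))    # both routes give the same dates
```

Either way the column ends up with object dtype, so the .dt accessor is no longer available on it afterwards.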