How to remove T00:00:00+05:30 after the year, month and date values in pandas? I tried converting the column to datetime, but it still shows the same result. I'm using pandas in Streamlit. I tried the code below:
df['Date'] = pd.to_datetime(df['Date'])
The output is still the same, as shown below:
Date
2019-07-01T00:00:00+05:30
2019-07-01T00:00:00+05:30
2019-07-02T00:00:00+05:30
2019-07-02T00:00:00+05:30
2019-07-02T00:00:00+05:30
2019-07-03T00:00:00+05:30
2019-07-03T00:00:00+05:30
2019-07-04T00:00:00+05:30
2019-07-04T00:00:00+05:30
2019-07-05T00:00:00+05:30
Can anyone help me remove T00:00:00+05:30 from the above rows?
CodePudding user response:
Don't bother with apply to Python dates or with string changes. The former will leave you with an object-dtype column and the latter is slow. Just round to the day frequency using the library function.
>>> pd.Series([pd.Timestamp('2000-01-05 12:01')]).dt.round('D')
0 2000-01-06
dtype: datetime64[ns]
If you have a timezone-aware timestamp, convert it to UTC with no time zone, then round:
>>> pd.Series([pd.Timestamp('2019-07-01T00:00:00+05:30')]).dt.tz_convert(None) \
...     .dt.round('D')
0 2019-07-01
dtype: datetime64[ns]
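One caveat about the rounding step, offered as a hedged aside: round('D') goes to the nearest midnight, so a time after noon moves to the next day, as the 12:01 example shows. If the aim is only to drop the time of day, dt.floor('D') or dt.normalize() keep the same calendar day. A minimal sketch, using made-up sample values:

```python
import pandas as pd

# Made-up sample values for illustration only
s = pd.Series(pd.to_datetime(["2000-01-05 12:01", "2000-01-05 08:30"]))

rounded = s.dt.round("D")      # nearest midnight: 12:01 moves to the next day
floored = s.dt.floor("D")      # midnight at the start of the same day
normalized = s.dt.normalize()  # sets the time to 00:00:00, keeps datetime64

print(rounded.tolist())
print(floored.tolist())
print(normalized.tolist())
```

All three results stay in a datetime64 column, so the .dt accessor remains available afterwards.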
CodePudding user response:
If I understand correctly, you want to keep only the date part.
Convert date strings to datetime
df = pd.DataFrame(
    columns=['date'],
    data=["2019-07-01T02:00:00+05:30", "2019-07-02T01:00:00+05:30"]
)
date
0 2019-07-01T02:00:00+05:30
1 2019-07-02T01:00:00+05:30
df['date'] = pd.to_datetime(df['date'])
date
0 2019-07-01 02:00:00+05:30
1 2019-07-02 01:00:00+05:30
Remove the timezone
df['date'] = df['date'].dt.tz_localize(None)
date
0 2019-07-01 02:00:00
1 2019-07-02 01:00:00
Keep the date only
df['date'] = df['date'].dt.date
0 2019-07-01
1 2019-07-02
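The three steps above can be put together in one runnable sketch. It assumes the strings carry a +05:30 offset, and it also computes tz_convert(None) purely for comparison, since that call shifts the values to UTC before dropping the offset while tz_localize(None) keeps the local wall-clock time. The names local_naive and utc_naive are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {"date": ["2019-07-01T02:00:00+05:30", "2019-07-02T01:00:00+05:30"]}
)

# 1. Parse the ISO 8601 strings; the result is a timezone-aware column
df["date"] = pd.to_datetime(df["date"])

# 2. Drop the offset. tz_localize(None) keeps the local wall-clock time,
#    while tz_convert(None) first shifts the values to UTC.
local_naive = df["date"].dt.tz_localize(None)
utc_naive = df["date"].dt.tz_convert(None)

# 3. Keep only the date (an object column of datetime.date values)
df["date"] = local_naive.dt.date

print(df)
print(utc_naive)
```

With early-morning local times like these, the UTC version lands on the previous calendar day, which is why the choice between the two calls matters when only the date is kept.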
CodePudding user response:
Pandas has no dedicated dtype for datetime.date values (they are stored in an object column), but you could use .apply to achieve this if you want date objects instead of strings:
import pandas as pd
import datetime

df = pd.DataFrame(
    {"date": [
        "2019-07-01T00:00:00+05:30",
        "2019-07-01T00:00:00+05:30",
        "2019-07-02T00:00:00+05:30",
        "2019-07-02T00:00:00+05:30",
        "2019-07-02T00:00:00+05:30",
        "2019-07-03T00:00:00+05:30",
        "2019-07-03T00:00:00+05:30",
        "2019-07-04T00:00:00+05:30",
        "2019-07-04T00:00:00+05:30",
        "2019-07-05T00:00:00+05:30"]})

# Parse each ISO 8601 string and keep only its date part
df["date"] = df["date"].apply(lambda x: datetime.datetime.fromisoformat(x).date())
print(df)
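To make the trade-off raised in the first answer concrete, here is a small sketch comparing the dtype produced by the .apply approach with a vectorized alternative. The variable names and the shortened sample list are illustrative, and the vectorized chain is one possible option rather than the method used in this answer.

```python
import datetime
import pandas as pd

raw = pd.Series(["2019-07-01T00:00:00+05:30", "2019-07-02T00:00:00+05:30"])

# apply + fromisoformat: a column of datetime.date objects (dtype object)
as_dates = raw.apply(lambda x: datetime.datetime.fromisoformat(x).date())

# Vectorized alternative: stays a datetime64 column with the time at midnight
as_midnight = pd.to_datetime(raw).dt.tz_localize(None).dt.normalize()

print(as_dates.dtype)
print(as_midnight.dtype)
```

The object column holds plain Python dates, which is convenient for display, while the datetime64 column keeps access to the .dt accessor and to fast datetime arithmetic.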