I have a data frame, which you can reproduce by running:
import pandas as pd
from io import StringIO
df = """
case_id scheduled_date code
1213 2021-08-17 1
3444 2021-06-24 3
4566 2021-07-20 5
"""
df = pd.read_csv(StringIO(df.strip()), sep=r'\s+', engine='python')
How can I change scheduled_date to keep only the year and month? The output should be:
case_id scheduled_date code
0 1213 2021-08 1
1 3444 2021-06 3
2 4566 2021-07 5
CodePudding user response:
You can also try this:
df['scheduled_date'] = pd.to_datetime(df.scheduled_date, format='%Y-%m-%d').dt.strftime('%Y-%m')
case_id scheduled_date code
0 1213 2021-08 1
1 3444 2021-06 3
2 4566 2021-07 5
CodePudding user response:
You can use string parsing to drop the day of the month (I'm assuming you want strings since the days in the expected output are absent):
df["scheduled_date"] = df["scheduled_date"].str.split("-").str[:2].str.join("-")
This outputs:
case_id scheduled_date code
0 1213 2021-08 1
1 3444 2021-06 3
2 4566 2021-07 5
CodePudding user response:
Convert the date to datetime and extract the year-month period from it:
df['month'] = pd.to_datetime(df['scheduled_date']).dt.to_period('M')
case_id scheduled_date code month
0 1213 2021-08-17 1 2021-08
1 3444 2021-06-24 3 2021-06
2 4566 2021-07-20 5 2021-07
Note that with this method the dtype will be period[M], not object.
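To make the dtype difference concrete, here is a small sketch that rebuilds the question's frame and puts the period approach next to the strftime approach. The column names month_period and month_str are made up for this comparison and are not part of the original answers.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "case_id": [1213, 3444, 4566],
    "scheduled_date": ["2021-08-17", "2021-06-24", "2021-07-20"],
    "code": [1, 3, 5],
})

dates = pd.to_datetime(df["scheduled_date"], format="%Y-%m-%d")

# Hypothetical column names, used only for this comparison.
df["month_period"] = dates.dt.to_period("M")   # dtype: period[M]
df["month_str"] = dates.dt.strftime("%Y-%m")   # plain strings

print(df.dtypes)

# A period column renders as the same year-month text via astype(str).
print(df["month_period"].astype(str).tolist())
```

A period column keeps date semantics, so it still sorts and compares chronologically, while the string column is probably the better fit if the values are only needed for display or export.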
CodePudding user response:
First, convert your string column to a datetime column. After that, you can apply many different date operations.
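As a rough illustration of that two-step idea, the sketch below parses the question's column once and then derives a few values from it. The variable names year, month, and year_month are chosen here for illustration only.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "case_id": [1213, 3444, 4566],
    "scheduled_date": ["2021-08-17", "2021-06-24", "2021-07-20"],
    "code": [1, 3, 5],
})

# Step 1: parse the strings into datetime64 values.
df["scheduled_date"] = pd.to_datetime(df["scheduled_date"], format="%Y-%m-%d")

# Step 2: use the .dt accessor for whatever date operation is needed.
year = df["scheduled_date"].dt.year                    # integer year
month = df["scheduled_date"].dt.month                  # integer month, 1 to 12
year_month = df["scheduled_date"].dt.strftime("%Y-%m") # year-month text

print(year_month.tolist())
```

Assigning year_month back to df["scheduled_date"] would give the output requested in the question.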