I have monthly time series data in a DataFrame, as shown below. df:
date
0 1997-01-01 00:00:00
1 1997-02-02 00:00:00
2 1997-03-03 00:00:00
3 1997-04-02 00:00:00
4 1997-05-02 00:00:00
5 1997-06-01 00:00:00
6 1997-07-01 00:00:00
7 1997-08-31 00:00:00
8 1997-09-30 00:00:00
9 1997-10-31 00:00:00
10 1997-11-30 00:00:00
11 1997-12-31 00:00:00
12 1998-01-01 00:00:00
13 1998-02-28 00:00:00
14 1998-03-31 00:00:00
From the above data, I would like to keep month-level precision and set the day of every date to 01.
Expected output:
date
0 1997-01-01
1 1997-02-01
2 1997-03-01
3 1997-04-01
4 1997-05-01
5 1997-06-01
6 1997-07-01
7 1997-08-01
8 1997-09-01
9 1997-10-01
10 1997-11-01
11 1997-12-01
12 1998-01-01
13 1998-02-01
14 1998-03-01
I tried the code below:
df['date'] = pd.to_datetime(df['date'], format='%Y-%m')
But it did not change the day of the dates. For example, I expected these replacements:
1997-02-02 to 1997-02-01
1997-04-02 to 1997-04-01
1997-12-31 to 1997-12-01
1998-02-28 to 1998-02-01
CodePudding user response:
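The format argument only tells to_datetime how to parse strings; it does not truncate the result to that precision. Add a DateOffset with day=1 instead, which replaces the day component:
df['date'] = pd.to_datetime(df['date']) + pd.offsets.DateOffset(day=1)
print(df)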
date
0 1997-01-01
1 1997-02-01
2 1997-03-01
3 1997-04-01
4 1997-05-01
5 1997-06-01
6 1997-07-01
7 1997-08-01
8 1997-09-01
9 1997-10-01
10 1997-11-01
11 1997-12-01
12 1998-01-01
13 1998-02-01
14 1998-03-01
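For reference, here is a minimal self-contained sketch of the same idea. It assumes the column is already datetime64 and uses only a few rows copied from the question; the names first_of_month and result are made up for illustration. Depending on the pandas version, adding this offset to a whole column may emit a PerformanceWarning, but it still returns the expected values.

```python
import pandas as pd

# A few sample rows from the question, already parsed as datetimes.
df = pd.DataFrame({
    "date": pd.to_datetime(["1997-02-02", "1997-08-31", "1997-12-31", "1998-02-28"])
})

# day=1 (singular) replaces the day component; days=1 (plural) would shift by one day.
first_of_month = df["date"] + pd.offsets.DateOffset(day=1)

# Format as strings only to make the result easy to inspect.
result = first_of_month.dt.strftime("%Y-%m-%d").tolist()
print(result)
```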
Or subtract the number of days elapsed since the first of the month:
d = pd.to_datetime(df['date'])
df['date'] = d - pd.to_timedelta(d.dt.day - 1, unit='D')
print(df)
date
0 1997-01-01
1 1997-02-01
2 1997-03-01
3 1997-04-01
4 1997-05-01
5 1997-06-01
6 1997-07-01
7 1997-08-01
8 1997-09-01
9 1997-10-01
10 1997-11-01
11 1997-12-01
12 1998-01-01
13 1998-02-01
14 1998-03-01
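Another option, not part of the answer above, is to round-trip through a monthly period. This is only a sketch under the assumption that the values are already datetime64; the names dates, via_period and via_timedelta are hypothetical. It also checks that the result matches the timedelta approach.

```python
import pandas as pd

# Sample dates from the question.
dates = pd.Series(pd.to_datetime(["1997-04-02", "1997-12-31", "1998-03-31"]))

# Convert to monthly periods, then back to timestamps at the start of each period.
via_period = dates.dt.to_period("M").dt.to_timestamp()

# The timedelta approach, for comparison.
via_timedelta = dates - pd.to_timedelta(dates.dt.day - 1, unit="D")

period_result = via_period.dt.strftime("%Y-%m-%d").tolist()
timedelta_result = via_timedelta.dt.strftime("%Y-%m-%d").tolist()
print(period_result)
```

The period round trip also drops any time-of-day component, whereas the offset and timedelta approaches keep it, so the two only agree when the times are midnight as in the question.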