I am thinking of trying if condition, but is there any library or method which I don't know about can solve this?
CodePudding user response:
You can subtract actual month-year period with months from values with decimals with month(s)
and assign back to DataFrames, for days convert values to timedeltas and subtract actual datetime by Series.rsub
for subtract from right side:
print (df)
col
0 28 days ago
1 4 months ago
2 11 months ago
3 Oct, 2021
now = pd.Timestamp('now')
per = now.to_period('m')
date = now.floor('d')
s = df['col'].str.extract('(\d )\s*month', expand=False).astype(float)
s1 = df['col'].str.extract('(\d )\s*day', expand=False).astype(float)
mask, mask1 = s.notna(), s1.notna()
df.loc[mask, 'col'] = s[mask].astype(int).rsub(per).dt.strftime('%b, %Y')
df.loc[mask1, 'col'] = pd.to_timedelta(s1[mask1], unit='d').rsub(date).dt.strftime('%b, %Y')
print (df)
col
0 Sep, 2022
1 Jun, 2022
2 Nov, 2021
3 Oct, 2021
CodePudding user response:
Assuming this input:
col
0 4 months ago
1 Oct, 2021
2 9 months ago
You can use:
# try to get a date:
s = pd.to_datetime(df['col'], errors='coerce')
# extract the month offset
offset = (df['col']
.str.extract(r'(\d ) months? ago', expand=False)
.fillna(0).astype(int)
)
# if the date it NaT, replace by today - n months
df['date'] = s.fillna(pd.Timestamp('today').normalize()
- offset*pd.DateOffset(months=1))
If you want a Mon, Year format:
df['date2'] = df['col'].where(offset.eq(0),
(pd.Timestamp('today').normalize()
-offset*pd.DateOffset(months=1)
).dt.strftime('%b, %Y')
)
output:
col date date2
0 4 months ago 2022-06-28 Jun, 2022
1 Oct, 2021 2021-10-01 Oct, 2021
2 9 months ago 2022-01-28 Jan, 2022