I have a dataset where, whenever a date value in the Update
column is entered into the input() prompt, Date1, Date2 and Date3
values will be updated according to these rules:
Date1 is 5 months from the date entered in the input() prompt
Date2 is 2 months from the date entered in the input() prompt
Date3 is 1 month from the date entered in the input() prompt
The only data that is changing are the dates, which are essentially getting shifted based upon the user input.
Data
ID Stat Date1 Date2 Date3 Update
aa 500 1/1/2021 4/1/2021 5/1/2021 6/1/2021
bb 800 1/1/2021 4/1/2021 5/1/2021 6/1/2021
Desired
Input prompt will ask user which date value they wish to input. User inputs the date '8/1/2021', which updates the remaining column date values. based on rules above.
ID Stat Date1 Date2 Date3 Update
aa 500 3/1/2021 6/1/2021 7/1/2021 8/1/2021
bb 800 3/1/2021 6/1/2021 7/1/2021 8/1/2021
Doing
I believe I can use a combination of a function as well as user prompt to approach this problem.
Type input
datevalue = pd.to_datetime(input("Enter date value: "))
print(datevalue)
Use input variable in function or script to create date update for Date1, Date2 and Date3
df[0] = df[1].apply(lambda x: datevalue - pd.DateOffset(months=x))
s = df['Update'].str.replace(r'(\S ) (\S )', r'\2\1')
df['Update'] = (pd.PeriodIndex(s, freq='D') 3).strftime('D%q %Y')
I am looking for some starting point suggestion or a good foundation/documentation on how to best approach this problem. I am still researching. Any suggestion is appreciated.
CodePudding user response:
We can define a mapping dictionary which maps the column names to corresponding dateoffset, then create a series from this dictionary and subtract the datevalue
from offsets, finally assign
the updated date values back to dataframe
d = {
'Date1': pd.DateOffset(months=5),
'Date2': pd.DateOffset(months=2),
'Date3': pd.DateOffset(months=1),
}
s = pd.Series(d).rsub(datevalue)
df.assign(**{**s, 'Update': datevalue})
ID Stat Date1 Date2 Date3 Update
0 aa 500 2021-03-01 2021-06-01 2021-07-01 2021-08-01
1 bb 800 2021-03-01 2021-06-01 2021-07-01 2021-08-01
CodePudding user response:
I think you're looking for something like this:
cols = df.filter(like='Date').columns.tolist() ['Update']
df[cols] = df[cols].apply(lambda col: pd.to_datetime(col) (datevalue - df['Update']))
Output:
>>> df
ID Stat Date1 Date2 Date3 Update
0 aa 500 2021-03-03 2021-06-01 2021-07-01 2021-08-01
1 bb 800 2021-03-03 2021-06-01 2021-07-01 2021-08-01