I have a dataset where I would like a certain statement to only be applied to specific rows in my dataset. I would like this to only be applied to rows that contain [type] == 'aa', and apply second statement for rows that contain [type] == 'bb'
Data
location type mig1 de mig2 re
ny aa 8/1/2021 10/1/2021 1/1/2022 2/1/2022
ny aa 8/1/2021 10/1/2021 1/1/2022 2/1/2022
ca aa 8/1/2021 10/1/2021 1/1/2022 2/1/2022
tx bb 9/1/2021 11/1/2021 2/1/2022 3/1/2022
Desired
#Date 8/1/2022 is input, which yields:
location type mig1 de mig2 re
ny aa 03/1/2022 05/1/2022 8/1/2022 7/1/2022
ny aa 03/1/2022 05/1/2022 8/1/2022 7/1/2022
ca aa 03/1/2022 05/1/2022 8/1/2022 7/1/2022
tx bb 03/1/2021 11/1/2021 08/1/2022 3/1/2022
Logic - we see that the shift applies to all columns when the type == aa
when the type value == bb, a different statement is applied
Doing
#takes input value
datevalue = pd.to_datetime(input("Enter shift: "))
#shifts dates from the datevalue input - However I would like d variable to only be applied to the rows that contain [type] == 'aa'
#apply e variable to rows that contain [type] = 'bb'
d = {
'mig1': pd.DateOffset(months=5),
'de': pd.DateOffset(months=3),
're': pd.DateOffset(months=1),
}
e = {
'mig1': pd.DateOffset(months=5),
}
s = pd.Series(d).rsub(datevalue)
df.assign(**{**s, 'mig2': datevalue})
Any suggestion is appreciated
CodePudding user response:
#Coerce dates to datetime
df1=df1.set_index(['location','type']).apply(lambda x: pd.to_datetime(x,format='%d/%m/%Y'))
#Set non dates as index, slice level two and impose the datetifference
df1.loc[ ( slice(None), 'aa' ), : ]=df1.loc[ ( slice(None), 'aa' ), : ]-pd.to_timedelta(5, unit='d')
mig1 de mig2 re
location type
ny aa 2021-01-03 2021-01-05 2021-12-27 2021-12-28
aa 2021-01-03 2021-01-05 2021-12-27 2021-12-28
ca aa 2021-01-03 2021-01-05 2021-12-27 2021-12-28
tx bb 2021-01-09 2021-01-11 2022-01-02 2022-01-03
CodePudding user response:
Try:
datevalue = pd.to_datetime(input("Enter shift: "), dayfirst=False) # 08/01/2022
new_values = pd.Series(d).rsub(datevalue).dt.strftime('%-m/%-d/%Y')
df.update(pd.DataFrame([new_values], index=df[df['type'] == 'aa'].index))
# do the same for type='bb' and 'e' dict
Output:
>>> df
location type mig1 de mig2 re
0 ny aa 3/1/2022 5/1/2022 1/1/2022 7/1/2022
1 ny aa 3/1/2022 5/1/2022 1/1/2022 7/1/2022
2 ca aa 3/1/2022 5/1/2022 1/1/2022 7/1/2022
3 tx bb 3/1/2022 11/1/2021 2/1/2022 3/1/2022