I have multiple data-frames where I need to loop start and end dates. The end dates are fixed; however, the start dates would varies between 1 to 10 years. For the sample below, all start dates are 6 years prior to the end dates. Therefore, is there a way to coding it where all k[i] are 6 years prior to l[i] instead of manually entering them for each df? I've tried using l[i] - 6, however, i got an error.
start dates:
k = {}
k[0]='2009-08-31'
k[1]='2010-08-31'
k[2]='2011-08-31'
end date:
l = {}
l[0]='2015-07-31'
l[1]='2016-07-31'
l[2]='2017-07-31'
t_l = len(l)
MAX_NUMBER = t_l
for i in range(MAX_NUMBER):
df = df0.copy()
tr_sta = k[i]
tr_end = l[i]
df = df[(df['date'] > tr_sta)]
df = df[(df['date'] < tr_end)]
thanks in advance
CodePudding user response:
Let's say you first have a dataframe only with end dates
df = pd.DataFrame({'end date': [val for key, val in l.items()]})
Then, define start dates by using pd.offsets.DateOffset
:
df['end date'] = pd.to_datetime(df['end date'], format='%Y-%m-%d')
df['start date'] = df['end date'] pd.offsets.DateOffset(years=6)
By the way, your start dates are actually 6 years prior and 1 month forward of the end dates. So you may also pass
df['start date'] = df['end date'] pd.offsets.DateOffset(years=6, months=1)