Home > Back-end >  How to calculate custom fiscal year in pandas?
How to calculate custom fiscal year in pandas?

Time:03-03

I have a dataframe like as shown below

app_date
20/3/2017
28/8/2017
18/10/2017
15/2/2017
2/5/2017
11/9/2016

df = pd.read_clipboard()

Our company fiscal year is from October of current year to September of next year

Q1 - Oct to Dec
Q2 - Jan to Mar
Q3 - Apr to Jun
Q4 - July - Sep  

I was trying something like below

tf['app_date'] = pd.to_datetime(tf['app_date'])
tf['act_month'] = pd.DatetimeIndex(tf['app_date']).month
tf['act_year'] = pd.DatetimeIndex(tf['app_date']).year
tf['act_qtr'] = tf['app_date'].dt.to_period('Q').dt.strftime('Q%q')
tf['comp_fis_year'] = np.where(tf['act_month'] >= 9,tf['act_year'] 1,tf['act_year'])
tf['comp_fis_qtr'] = tf['app_date'].dt.to_period('Q').add(1).dt.strftime('Q%q') #thanks to jezrael for this trick to get quarter

Is there any elegant and efficient way to do the above? Mainly for calculating the fiscal year based on our financial year (Oct to Sep)?

I expect my output to be like as shown below

enter image description here

CodePudding user response:

IIUC, use to_period with a custom frequency:

pd.to_datetime(df['app_date'], dayfirst=True).dt.to_period('Q-SEP')

output:

0    2017Q2
1    2017Q4
2    2018Q1
3    2017Q2
4    2017Q3
5    2016Q4
Name: app_date, dtype: period[Q-SEP]

for separate columns:

s = pd.to_datetime(df['app_date'], dayfirst=True).dt.to_period('Q-SEP')
s.astype(str).str.extract('(?P<year>\d )(?P<quarter>Q\d )')

output:

   year quarter
0  2017      Q2
1  2017      Q4
2  2018      Q1
3  2017      Q2
4  2017      Q3
5  2016      Q4

Q start/end:

df['Q'] = pd.to_datetime(df['app_date'], dayfirst=True).dt.to_period('Q-SEP')
df['Qstart'] = df['Q'].dt.asfreq('D', 's')
df['Qend'] = df['Q'].dt.asfreq('D', 'e')

output:

     app_date       Q      Qstart        Qend
0   20/3/2017  2017Q2  2017-01-01  2017-03-31
1   28/8/2017  2017Q4  2017-07-01  2017-09-30
2  18/10/2017  2018Q1  2017-10-01  2017-12-31
3   15/2/2017  2017Q2  2017-01-01  2017-03-31
4    2/5/2017  2017Q3  2017-04-01  2017-06-30
5   11/9/2016  2016Q4  2016-07-01  2016-09-30

CodePudding user response:

You can use:

# Create 2 DatetimeIndex instead of a Series (avoid using .dt accessor)
start = pd.to_datetime(df['app_date'].values, dayfirst=False)
end = start   pd.DateOffset(months=3)

cols = ['act_month', 'act_year', 'act_qtr', 'comp_fis_year', 'comp_fis_qtr']
df = df.join(pd.DataFrame([start.month, start.year, 'Q'   start.quarter.astype(str),
                           end.year, 'Q'   end.quarter.astype(str)],
                           index=cols).T)

Output:

>>> df
     app_date act_month act_year act_qtr comp_fis_year comp_fis_qtr
0   3/20/2017         3     2017      Q1          2017           Q2
1   8/28/2017         8     2017      Q3          2017           Q4
2  10/18/2017        10     2017      Q4          2018           Q1
3   2/15/2017         2     2017      Q1          2017           Q2
4    2/5/2017         2     2017      Q1          2017           Q2
5   11/9/2016        11     2016      Q4          2017           Q1

Setup:

app_date
3/20/2017
8/28/2017
10/18/2017
2/15/2017
2/5/2017
11/9/2016

df = pd.read_clipboard()
  • Related