I have a table like below, using this dummy data:
data = [['Jane', 10,10.5,11,13,45,41,66,21,88,99,77,84,66,8,77,22,11,44,69,85,36,4,
87,74,56,88,23,6,9,8,55,12,4,58,36,44,89,81,7,98,52,11,45,87,96,32,58,76],
['John',11,22,55,23,6,9,8,41,12,4,58,66,99,36,44,89,81,7,98,52,33,11,45,87,
96,32,58,55,12,4,58,13,36,52,89,81,7,98,34,44,11,25,22,44,55,88,56,47],
['Tom',23,32,43,12,10,10.5,11,46,45,45,46,66,21,88,99,77,8,77,22,11,44,69,8511,44,77,85,99,45,63,
89,81,7,98,3,52,89,81,7,98,13,58,36,44,89,81,7,98,52]]
df = pd.DataFrame(data, columns = ['Name','January-19','February-19','March-19','April-19','May-19','June-19',
'July-19','August-19', 'September-19', 'October-19','November-19', 'December-19',
'January-20','February-20','March-20','April-20','May-20','June-20',
'July-20','August-20', 'September-20', 'October-20','November-20','December-20',
'January-21','February-21','March-21','April-21','May-21','June-21',
'July-21','August-21', 'September-21', 'October-21','November-21',
'December-21','Q1 2019',
'Q2 2019', 'Q3 2019', 'Q4 2019','Q1 2020',
'Q2 2020', 'Q3 2020', 'Q4 2020','Q1 2021',
'Q2 2021', 'Q3 2021', 'Q4 2021'])
This results in the follow dataframe:
Each entry indicates a growth rate.
I need to calculate 2Y and 3Y stacked growth rates using a specific formula for each month and quarter in 2021.
Formula example:
3Y Jan 21 = ((1 Jan 2021 rate) * (1 Jan 2020 rate) * (1 Jan 2019 rate)-1
2Y Jan 21 = ((1 Jan 2021 rate) * (1 Jan 2020 rate))-1
I had initially done the following, which was to transpose the data and try to use the percentage change method in Pandas, indicating a time period of 24 months for the 2 year stacked column:
test1 = df.T
new_header = test1.iloc[0] #grab the first row for the header
test1 = test1[1:] #take the data less the header row
test1.columns = new_header #
test1 = test1.reset_index()
test1.rename(columns={'index': 'Date'}, inplace = True)
test1['Jane_2Y_Growth'] = test1['Jane'].pct_change(24)
test1.tail(15)
However I don't think its correct because I get a different answer when I manually do the formula.
Does anyone know how to apply this formula to all of my periods in 2021? 12 months and 4 quarters.
So, the 2Y stacked growth column for Q4 2021 for example, would use the Q4 figures from 2021 and 2020.
Thanks
CodePudding user response:
Based on the formula example you have given :
3Y Jan 21 = ((1 Jan 2021 rate) * (1 Jan 2020 rate) * (1 Jan 2019 rate)-1
2Y Jan 21 = ((1 Jan 2021 rate) * (1 Jan 2020 rate))-1
Transposing and filtering values of interest from dataframe
NDF = df.T
NDF, NDF.columns = NDF[1:] , NDF.iloc[0]
result_df = NDF.loc[[x for x in NDF.index if '21' in x ]]
Creating placeholder columns for the resultant dataframe
for col in NDF.columns:
result_df[col '_2Y Growth'] = 0
result_df[col '_3Y Growth'] = 0
The above gives us
result_df
Name Jane John Tom ... John_3Y Growth Tom_2Y Growth Tom_3Y Growth
January-21 56 96 77 ... 0 0 0
February-21 88 32 85 ... 0 0 0
March-21 23 58 99 ... 0 0 0
April-21 6 55 45 ... 0 0 0
May-21 9 12 63 ... 0 0 0
June-21 8 4 89 ... 0 0 0
July-21 55 58 81 ... 0 0 0
August-21 12 13 7 ... 0 0 0
September-21 4 36 98 ... 0 0 0
October-21 58 52 3 ... 0 0 0
November-21 36 89 52 ... 0 0 0
December-21 44 81 89 ... 0 0 0
Q1 2021 96 55 81 ... 0 0 0
Q2 2021 32 88 7 ... 0 0 0
Q3 2021 58 56 98 ... 0 0 0
Q4 2021 76 47 52 ... 0 0 0
Applying the formula :
for idx in NDF.index:
if('21' in idx):
idx20 = idx[:-2] '20'
idx19 = idx[:-2] '19'
for col in NDF.columns:
result_df.loc[idx, col '_3Y Growth'] = (1 NDF.loc[idx, col]) * (1 NDF.loc[idx20, col]) * (1 NDF.loc[idx19, col]) - 1
result_df.loc[idx, col '_2Y Growth'] = (1 NDF.loc[idx, col]) * (1 NDF.loc[idx20, col]) - 1
Output :
This gives us the expected output based on the formula
Jane | John | Tom | Jane_2Y Growth | Jane_3Y Growth | John_2Y Growth | John_3Y Growth | Tom_2Y Growth | Tom_3Y Growth | |
---|---|---|---|---|---|---|---|---|---|
January-21 | 56 | 96 | 77 | 3818 | 42008 | 9699 | 116399 | 1715 | 41183 |
February-21 | 88 | 32 | 85 | 800 | 9210.5 | 1220 | 28082 | 7653 | 252581 |
March-21 | 23 | 58 | 99 | 1871 | 22463 | 2654 | 148679 | 9999 | 439999 |
April-21 | 6 | 55 | 45 | 160 | 2253 | 5039 | 120959 | 3587 | 46643 |
May-21 | 9 | 12 | 63 | 119 | 5519 | 1065 | 7461 | 575 | 6335 |
June-21 | 8 | 4 | 89 | 404 | 17009 | 39 | 399 | 7019 | 80729 |
July-21 | 55 | 58 | 81 | 3919 | 262639 | 5840 | 52568 | 1885 | 22631 |
August-21 | 12 | 13 | 7 | 1117 | 24595 | 741 | 31163 | 95 | 4511 |
September-21 | 4 | 36 | 98 | 184 | 16464 | 1257 | 16353 | 4454 | 204929 |
October-21 | 58 | 52 | 3 | 294 | 29499 | 635 | 3179 | 279 | 12879 |
November-21 | 36 | 89 | 52 | 3255 | 253967 | 4139 | 244259 | 451135 | 21203391 |
December-21 | 44 | 81 | 89 | 3374 | 286874 | 7215 | 483471 | 4049 | 271349 |
Q1 2021 | 96 | 55 | 81 | 5140 | 462689 | 671 | 5375 | 4837 | 396715 |
Q2 2021 | 32 | 88 | 7 | 395 | 32471 | 2313 | 229085 | 295 | 2367 |
Q3 2021 | 58 | 56 | 98 | 2713 | 21711 | 1310 | 45884 | 4454 | 441044 |
Q4 2021 | 76 | 47 | 52 | 6775 | 670823 | 2159 | 97199 | 4769 | 66779 |
EDIT : As OP asked,
Is there a way to add the new columns to the bottom instead and instead of labelling them 'Jane_2Y Growth' for example, label then 'January-21 2Y growth'. This way they are added to the index and do not have any names in them, just the date and the growth period.
Instead of adding the result to results
dataframe we can add it to NDF
datafrmae in the appropriate cell. The result would be stored in NDF
for idx in NDF.index:
if('21' in idx):
idx20 = idx[:-2] '20'
idx19 = idx[:-2] '19'
for col in NDF.columns:
NDF.loc[idx ' 3Y Growth', col] = (1 NDF.loc[idx, col]) * (1 NDF.loc[idx20, col]) * (1 NDF.loc[idx19, col]) - 1
NDF.loc[idx ' 2Y Growth', col] = (1 NDF.loc[idx, col]) * (1 NDF.loc[idx20, col]) - 1
This gives us the expected output :
Jane | John | Tom | |
---|---|---|---|
January-19 | 10 | 11 | 23 |
February-19 | 10.5 | 22 | 32 |
March-19 | 11 | 55 | 43 |
April-19 | 13 | 23 | 12 |
May-19 | 45 | 6 | 10 |
June-19 | 41 | 9 | 10.5 |
July-19 | 66 | 8 | 11 |
August-19 | 21 | 41 | 46 |
September-19 | 88 | 12 | 45 |
October-19 | 99 | 4 | 45 |
November-19 | 77 | 58 | 46 |
December-19 | 84 | 66 | 66 |
January-20 | 66 | 99 | 21 |
February-20 | 8 | 36 | 88 |
March-20 | 77 | 44 | 99 |
April-20 | 22 | 89 | 77 |
May-20 | 11 | 81 | 8 |
June-20 | 44 | 7 | 77 |
July-20 | 69 | 98 | 22 |
August-20 | 85 | 52 | 11 |
September-20 | 36 | 33 | 44 |
October-20 | 4 | 11 | 69 |
November-20 | 87 | 45 | 8511 |
December-20 | 74 | 87 | 44 |
January-21 | 56 | 96 | 77 |
February-21 | 88 | 32 | 85 |
March-21 | 23 | 58 | 99 |
April-21 | 6 | 55 | 45 |
May-21 | 9 | 12 | 63 |
June-21 | 8 | 4 | 89 |
July-21 | 55 | 58 | 81 |
August-21 | 12 | 13 | 7 |
September-21 | 4 | 36 | 98 |
October-21 | 58 | 52 | 3 |
November-21 | 36 | 89 | 52 |
December-21 | 44 | 81 | 89 |
Q1 2019 | 89 | 7 | 81 |
Q2 2019 | 81 | 98 | 7 |
Q3 2019 | 7 | 34 | 98 |
Q4 2019 | 98 | 44 | 13 |
Q1 2020 | 52 | 11 | 58 |
Q2 2020 | 11 | 25 | 36 |
Q3 2020 | 45 | 22 | 44 |
Q4 2020 | 87 | 44 | 89 |
Q1 2021 | 96 | 55 | 81 |
Q2 2021 | 32 | 88 | 7 |
Q3 2021 | 58 | 56 | 98 |
Q4 2021 | 76 | 47 | 52 |
January-21 3Y Growth | 42008 | 116399 | 41183 |
January-21 2Y Growth | 3818 | 9699 | 1715 |
February-21 3Y Growth | 9210.5 | 28082 | 252581 |
February-21 2Y Growth | 800 | 1220 | 7653 |
March-21 3Y Growth | 22463 | 148679 | 439999 |
March-21 2Y Growth | 1871 | 2654 | 9999 |
April-21 3Y Growth | 2253 | 120959 | 46643 |
April-21 2Y Growth | 160 | 5039 | 3587 |
May-21 3Y Growth | 5519 | 7461 | 6335 |
May-21 2Y Growth | 119 | 1065 | 575 |
June-21 3Y Growth | 17009 | 399 | 80729 |
June-21 2Y Growth | 404 | 39 | 7019 |
July-21 3Y Growth | 262639 | 52568 | 22631 |
July-21 2Y Growth | 3919 | 5840 | 1885 |
August-21 3Y Growth | 24595 | 31163 | 4511 |
August-21 2Y Growth | 1117 | 741 | 95 |
September-21 3Y Growth | 16464 | 16353 | 204929 |
September-21 2Y Growth | 184 | 1257 | 4454 |
October-21 3Y Growth | 29499 | 3179 | 12879 |
October-21 2Y Growth | 294 | 635 | 279 |
November-21 3Y Growth | 253967 | 244259 | 2.12034e 07 |
November-21 2Y Growth | 3255 | 4139 | 451135 |
December-21 3Y Growth | 286874 | 483471 | 271349 |
December-21 2Y Growth | 3374 | 7215 | 4049 |
Q1 2021 3Y Growth | 462689 | 5375 | 396715 |
Q1 2021 2Y Growth | 5140 | 671 | 4837 |
Q2 2021 3Y Growth | 32471 | 229085 | 2367 |
Q2 2021 2Y Growth | 395 | 2313 | 295 |
Q3 2021 3Y Growth | 21711 | 45884 | 441044 |
Q3 2021 2Y Growth | 2713 | 1310 | 4454 |
Q4 2021 3Y Growth | 670823 | 97199 | 66779 |
Q4 2021 2Y Growth | 6775 | 2159 | 4769 |