Home > Mobile >  Get ratio of column values for different rows
Get ratio of column values for different rows

Time:12-16

I have a pandas dataframe like this:

Date Route Col1 Col2
01/01/2020 A 99 26
01/02/2020 A 37 96
01/03/2020 A 20 83
01/01/2021 A 50 79
01/02/2021 A 16 50
01/03/2021 A 52 33
01/01/2020 B 48 44
01/02/2020 B 96 30
01/03/2020 B 17 42
01/01/2021 B 34 74
01/02/2021 B 49 37
01/03/2021 B 70 12

I need to get the ratio for Col1 and Col2 based on 2020 and 2021 values. That is, a new column for each month that has Col2/Col1.

I have tried a few approaches (pct_change, diff), but I am unable to get the right result. I also made a DateTimeIndex to help me with this, but wasn't able to get the ratio.

This is the gist of what I am trying to do, but not sure how to frame the .loc command in this scenario:

df['Col1_ratio'] = df['Col2'] / df.loc(<get index of previous year's row>)['Col1']

This is the expected output dataframe:

Date Route Col1 Col2 Col1_ratio Col2_ratio
01/01/2020 A 99 26 NA NA
01/02/2020 A 37 96 NA NA
01/03/2020 A 20 83 NA NA
01/01/2021 A 50 79 0.505050505 3.038461538
01/02/2021 A 16 50 0.432432432 0.520833333
01/03/2021 A 52 33 2.6 0.397590361
01/01/2020 B 48 44 NA NA
01/02/2020 B 96 30 NA NA
01/03/2020 B 17 42 NA NA
01/01/2021 B 34 74 0.708333333 1.681818182
01/02/2021 B 49 37 0.510416667 1.233333333
01/03/2021 B 70 12 4.117647059 0.285714286

E.g. Col1_ratio is Col1_JAN21 / Col1_JAN20, and so on.

CodePudding user response:

Idea is create MultiIndex and then add one year for datetimes in rename, divide columns and add suffix by DataFrame.add_suffix, last add to original by DataFrame.join:

df1 = df.set_index(['Date','Route'])

df2 = df1.rename(lambda x: x   pd.offsets.DateOffset(years=1), level=0)

df3 = df1[['Col1','Col2']].div(df2[['Col1','Col2']]).add_suffix('_ratio')

df = df.join(df3, on=['Date','Route'])
print (df)
         Date Route  Col1  Col2  Col1_ratio  Col2_ratio
0  2020-01-01     A    99    26         NaN         NaN
1  2020-01-02     A    37    96         NaN         NaN
2  2020-01-03     A    20    83         NaN         NaN
3  2021-01-01     A    50    79    0.505051    3.038462
4  2021-01-02     A    16    50    0.432432    0.520833
5  2021-01-03     A    52    33    2.600000    0.397590
6  2020-01-01     B    48    44         NaN         NaN
7  2020-01-02     B    96    30         NaN         NaN
8  2020-01-03     B    17    42         NaN         NaN
9  2021-01-01     B    34    74    0.708333    1.681818
10 2021-01-02     B    49    37    0.510417    1.233333
11 2021-01-03     B    70    12    4.117647    0.285714
  • Related