How to get the growth between two rows

Time:11-25

I'm trying to get the growth (in %) between two values from different periods. Here is what my DataFrame looks like:

     sessionSource dateRange activeUsers
0    instagram.com   current           5
1    instagram.com  previous           0
2  l.instagram.com   current          83
3  l.instagram.com  previous          11
4     snapchat.com   current           2
5     snapchat.com  previous           1

What I'm trying to get is:

     sessionSource dateRange activeUsers  Growth
0    instagram.com   current           5     xx%
2  l.instagram.com   current          83     xx%
4     snapchat.com   current           2     xx%

I'm not a pandas expert. I tried a few things, but nothing came close to what I need.

Thanks a lot for any help.

CodePudding user response:

Assuming you just need the percent change between previous and current, and the rows for each source are in the correct order (pct_change() compares each row with the one before it, so previous must come before current), you can group the data by source and take the percent change within each group. Call pct_change() on the grouped column and you should be good:

df['Growth'] = df.groupby('sessionSource')['activeUsers'].pct_change()

For example (taken from the official documentation and applied to a Series):

s = pd.Series([90, 91, 85])
s
0    90
1    91
2    85
dtype: int64

s.pct_change()
0         NaN
1    0.011111
2   -0.065934
dtype: float64
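
Applied to the question's data, the idea could look like the sketch below. It rebuilds the sample DataFrame by hand and sorts it so that each source's previous row comes before its current row, since the sample lists current first. The names `ordered` and `result` are just illustrative, and multiplying by 100 to get a percentage is an assumption about the desired format.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "sessionSource": ["instagram.com", "instagram.com",
                      "l.instagram.com", "l.instagram.com",
                      "snapchat.com", "snapchat.com"],
    "dateRange": ["current", "previous"] * 3,
    "activeUsers": [5, 0, 83, 11, 2, 1],
})

# Put each source's "previous" row before its "current" row
# ("previous" sorts after "current" alphabetically, hence descending).
ordered = df.sort_values(["sessionSource", "dateRange"],
                         ascending=[True, False])

# Percent change within each source, scaled to percent.
ordered["Growth"] = (
    ordered.groupby("sessionSource")["activeUsers"].pct_change() * 100
)

# Keep only the "current" rows and restore the original row order.
result = ordered[ordered["dateRange"] == "current"].sort_index()
print(result)
```

A source whose previous value is 0 (instagram.com here) comes out as inf, because the change is divided by zero. You may want to replace that with NaN or a label of your choice.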

CodePudding user response:

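You can sort the rows so that each source's previous row comes before its current row, then aggregate per source: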

(df.sort_values(by=['sessionSource', 'dateRange'],
                ascending=[True, False])
   .groupby('sessionSource', as_index=False)
   .agg({'dateRange': 'first', 'activeUsers': lambda s: s.pct_change().dropna().mul(100)})
 )

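Output (activeUsers now holds the growth in percent; dateRange reads previous because 'first' picks the first row of each sorted group, so use 'last' there to keep current):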

     sessionSource dateRange  activeUsers
0    instagram.com  previous          inf
1  l.instagram.com  previous   654.545455
2     snapchat.com  previous   100.000000
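
If you would rather not depend on row order at all, a different technique (not the one used in the answer above) is to pivot the two periods into columns and compute the growth directly. This is only a sketch: it assumes every source has exactly one current and one previous row, and the name `wide` is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "sessionSource": ["instagram.com", "instagram.com",
                      "l.instagram.com", "l.instagram.com",
                      "snapchat.com", "snapchat.com"],
    "dateRange": ["current", "previous"] * 3,
    "activeUsers": [5, 0, 83, 11, 2, 1],
})

# One row per source, one column per period.
wide = df.pivot(index="sessionSource", columns="dateRange",
                values="activeUsers")

# Growth in percent: (current - previous) / previous * 100.
wide["Growth"] = (wide["current"] - wide["previous"]) / wide["previous"] * 100
print(wide)
```

The formula is spelled out explicitly here, which makes the direction of the comparison easy to check. Division by a previous value of 0 still yields inf.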