Home > Software design >  Calculating trend per customer level in Python dataframe
Calculating trend per customer level in Python dataframe

Time:02-08

I have 4 columns in dataset which are cid(customer level), month, spending and transaction (max.cid=10000). As seen below, df.head().

    cid month   spending    transaction
0   1   3         61.94        28
1   1   4         73.02        23
2   1   7         59.34        25
3   1   8         48.69        24
4   1   9         121.79       26

I use the following function to calculate the trend (slope)in the outflow spending per customer. However, I get the identical number as a result for the whole dataset. Expected to calculate trend of spendings on customer level. (trend value for each customer).

Is there a way to iterate over each customer level in the dataset and obtain individual trends per customer? Thanks in advance!

df = pd.read_csv("/content/case_data.csv")

import numpy as np

def trendline(df, order=1):

    coeffs = np.polyfit(df.index.values, list(df), order)
    slope = coeffs[-2]
    return float(slope)

outflow = df['spending'] 
cid = df['cid']

df_ = pd.DataFrame({'cid': cid, 'outflow': outflow})
slope_outflow = trendline(df_['cid'])
slope_outflow

Output : 0.13377820413729283

Expected Output: (Trend1), (Trend2), (Trend3), ......, (Trend10000)

CodePudding user response:

def trendline(x, y, order=1):
    return np.polyfit(x, y, order)[-2]

df.groupby('cid').apply(lambda subdf: trendline(subdf['month'].values, subdf['spending'].values))

You can use groupby to calculate the trend by each cid value. In the above example it is for the trend of spending.

  •  Tags:  
  • Related