Home > Blockchain >  Apply a function to dataframe which includes previous row data
Apply a function to dataframe which includes previous row data

Time:01-26

I have an input dataframe for daily fruit spend which looks like this:

spend_df

Date        Apples      Pears      Grapes     
01/01/22      10         47          0
02/01/22      0          22          3
03/01/22      11         0           3
...

For each fruit, I need to apply a function using their respective parameters and inputs spends. The function includes the previous day and the current day spends, which is as follows:

y = beta(1 - exp(-(theta*previous current)/alpha))

parameters_df

Parameter    Apples      Pears      Grapes  
alpha         132         323        56
beta          424         31         33
theta         13          244        323

My output data frame should look like this (may contain errors):

profit_df

Date         Apples        Pears       Grapes     
01/01/22      30.93         4.19        0       
02/01/22      265.63        31.00       1.72
03/01/22      33.90         30.99       32.99
...     

This is what I attempted:

# First map parameters_df to spend_df
merged_df = input_df.merge(parameters_df, on=['Apples','Pears','Grapes'])

# Apply function to each row
profit_df = merged_df.apply(lambda x: beta(1 - exp(-(theta*x[-1]   x)/alpha))

CodePudding user response:

It might be easier to read if you extract the necessary variables from parameters_df and spend_df first. Then a simple application of the formula will produce the expected output.

# extract alpha, beta, theta from parameters df
alpha, beta, theta = parameters_df.iloc[:, 1:].values
# select fruit columns
current = spend_df[['Apples', 'Pears', 'Grapes']]
# find previous values of fruit columns
previous = current.shift(fill_value=0)

# calculate profit using formula
y = beta*(1 - np.exp(-(theta*previous   current) / alpha))
profit_df = spend_df[['Date']].join(y)

res

CodePudding user response:

Another approach using Pandas rolling function :

import pandas as pd
import numpy as np

sdf = pd.DataFrame({
  "Date": ['01/01/22', '02/01/22', '03/01/22'],
  "Apples": [10, 0, 11],
  "Pears": [47, 22, 0],
  "Grapes": [0, 3, 3],  
}).set_index("Date")

pdf = pd.DataFrame({
  "Parameter": ['alpha', 'beta', 'theta'],
  "Apples": [132, 424, 13],
  "Pears": [323, 31, 244],
  "Grapes": [56, 33, 323],  
}).set_index("Parameter")


def func(r):
    t = (pdf.loc['alpha', r.name], pdf.loc['beta', r.name], pdf.loc['theta', r.name])
    return r.rolling(2).apply(lambda x: t[1]*(1 - np.exp(-(t[2]*x[0]   x[1])/t[0])))

r1 = sdf.iloc[0:2,:].shift(fill_value=0).apply(lambda r: func(r), axis=0)
r = sdf.apply(lambda r: func(r), axis=0)

r.iloc[0]=r1.shift(-1).iloc[0]

print(r)

Result

              Apples      Pears     Grapes
Date                                      
01/01/22   30.934651   4.198004   0.000000
02/01/22  265.637775  31.000000   1.721338
03/01/22   33.901168  30.999998  32.999999
  • Related