Applying a function which requires other values from a column?

I am trying to create a very simple model that compares energy prices for a given week with those of the week before, so I am applying a function row by row across a pandas DataFrame. What I am struggling with is getting an earlier value from a column. I have moved the index into another column (called Counter) so that I can subtract the offset from it. That gives me the desired position, but I cannot get from that int to the value in the column. I understand that the first rows will raise an error (as will some missing values in the original data), so I am using try/except, and I have gathered additional data so that a few NaNs at the beginning are not an issue.

def u(df):
    if df['Weekday'] in {0,1,4,5,6}: # For Mon, Tue, Thur, Fri, Sat & Sun one week shift
        offset=168
    else:
        offset=24
    position=df['Counter']-offset
    try:
        oldprice=df.iat[position,'Price'] #Never works, always leads to exception
    except:
        oldprice = np.nan

    try:
        olddemand =df.iat[position,'Demand']
    except:
        olddemand = np.nan

    print(oldprice)
    newdemand = df['Demand']
    currentprice =df['Price']
    expprice=oldprice*(olddemand/newdemand)
    u=currentprice-expprice
    return u


results=df2.apply(u,axis=1)

The problem is that the try block never succeeds: I get NaNs across the board (when I set the except value to 1000 instead, I get large values everywhere). The Counter seems to be working fine; I printed it earlier, and it behaved as expected and was an int. I have also tried .at, with no success. Thank you for your time.

CodePudding user response：

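With axis=1, the argument your function u() receives is not the whole DataFrame but a single row (a Series), so the other rows cannot be reached from it. Indexing that row with .iat[position, 'Price'] therefore raises on every call, and the bare except turns each failure into NaN. A solution is to pass the whole DataFrame as an additional parameter. Your function u() would become: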

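import numpy as np


def u(row, df):
    # row is the current row (a Series); df is the whole DataFrame
    if row['Weekday'] in {0, 1, 4, 5, 6}:  # For Mon, Tue, Thur, Fri, Sat & Sun one week shift
        offset = 168
    else:
        offset = 24

    # Label of the earlier row; this assumes the index labels of df match 'Counter'
    position = row['Counter'] - offset

    try:
        oldprice = df.loc[position, 'Price']
    except KeyError:  # no such row, e.g. at the start of the data
        oldprice = np.nan

    try:
        olddemand = df.loc[position, 'Demand']
    except KeyError:
        olddemand = np.nan

    print(oldprice)  # debugging output
    newdemand = row['Demand']
    currentprice = row['Price']
    expprice = oldprice * (olddemand / newdemand)
    return currentprice - expprice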

To pass the DataFrame to the function, supply it through the args parameter of apply:

results = df2.apply(u, args=(df2,), axis=1)
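
To see the mechanism in isolation, here is a minimal sketch on made-up data. The `toy` DataFrame and the `lookup_previous` helper are hypothetical and not part of your code, and the offset is shortened to 2 rows to keep the example small. It assumes, as in the question, that Counter mirrors the default integer index.

```python
import numpy as np
import pandas as pd

# Made-up data; 'Counter' mirrors the default integer index.
toy = pd.DataFrame({
    "Counter": [0, 1, 2, 3],
    "Price": [10.0, 12.0, 11.0, 15.0],
})


def lookup_previous(row, df, offset=2):
    """Return the 'Price' found `offset` rows earlier, or NaN if there is none."""
    # With mixed int/float columns, apply(axis=1) hands over a float row,
    # so the counter is converted back to int before it is used as a label.
    position = int(row["Counter"]) - offset
    try:
        return df.loc[position, "Price"]
    except KeyError:
        return np.nan


# The extra positional argument in args is passed to the function after the row.
previous = toy.apply(lookup_previous, args=(toy,), axis=1)
print(previous.tolist())  # [nan, nan, 10.0, 12.0]
```

The first two rows have no row two steps back, so they come out as NaN; the remaining rows pick up the price from two rows earlier.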

Note that without a sample of your data, I could not make sure that the output is correct.
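
As an aside, if the rows are strictly hourly, sorted, and have no missing hours (an assumption, since the data is not shown), the same comparison could probably be done without apply, using Series.shift and numpy.where. This is only a sketch: the function name `price_surprise`, its parameters, and the sample data are made up for illustration, and the offsets are shortened in the usage example.

```python
import numpy as np
import pandas as pd


def price_surprise(df, weekly_days=(0, 1, 4, 5, 6), long_offset=168, short_offset=24):
    """Current price minus the demand-scaled price from an earlier row.

    Assumes one row per hour, in order, with no missing rows, so that
    shifting by N rows is the same as looking N hours back.
    """
    use_long = df["Weekday"].isin(weekly_days)
    # Pick the long or the short look-back per row.
    old_price = pd.Series(
        np.where(use_long, df["Price"].shift(long_offset), df["Price"].shift(short_offset)),
        index=df.index,
    )
    old_demand = pd.Series(
        np.where(use_long, df["Demand"].shift(long_offset), df["Demand"].shift(short_offset)),
        index=df.index,
    )
    expected = old_price * (old_demand / df["Demand"])
    return df["Price"] - expected


# Made-up sample with shortened offsets (2 rows and 1 row).
sample = pd.DataFrame({
    "Weekday": [0, 0, 2, 2, 4],
    "Price": [10.0, 12.0, 11.0, 15.0, 20.0],
    "Demand": [100.0, 100.0, 200.0, 100.0, 100.0],
})
result = price_surprise(sample, long_offset=2, short_offset=1)
print(result.tolist())  # [nan, nan, 5.0, -7.0, -2.0]
```

Rows without an earlier counterpart come out as NaN on their own, so no try/except is needed. If hours are missing from the data, a row shift no longer corresponds to a fixed time offset, and the label-based lookup above is the safer choice.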