How do I use .apply(func) with a conditional function (pandas)

Time:11-14

I have the following pandas DataFrame:

    count   area    volume      formula     quantity
0   1.0     22       NaN       count         1.0
1   1.0     15       NaN       count         1.0
2   1.0     1.4      NaN       area          1.4
3   1.0     0.6       10       volume        100

The quantity column is looked up from the column named in the formula column, e.g. row 0 has formula "count", so quantity is 1; row 2 has formula "area", so quantity is 1.4.

For this I have the following formula:

Merged['quantity']=Merged.apply(lambda x: x[x['QuantityFormula']] , axis=1)

However, the quantity for volume is a calculated field: volume * 10. I've written a function to handle both cases:

def func(x):
    if x[x['QuantityFormula']] == Volume:
        return volume * 10
    else:
        return x[x['QuantityFormula']]
   
    
df['Classification'] = Merged['QuantityFormula'].apply(func)

However, I get the following error:

Error: string indices must be integers

Any ideas? Thanks

Answer

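def func(row):
    # row is a full DataFrame row because apply is called with axis=1
    if row['QuantityFormula'] == 'Volume':
        return row['Volume'] * 10
    # otherwise look up the column named by QuantityFormula
    return row[row['QuantityFormula']]

Merged['Ans'] = Merged.apply(func, axis=1)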
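
For context on the error itself: calling .apply on a single column (a Series) passes each cell value to the function, so inside func the argument is a plain string, and indexing a string with another string raises the TypeError from the question. Calling .apply on the DataFrame with axis=1 passes the whole row instead. A minimal sketch, assuming a small sample frame with the lowercase column names from the question's table (the real Merged frame may differ):

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question's table (column names are an assumption)
df = pd.DataFrame({
    "count": [1.0, 1.0, 1.0, 1.0],
    "area": [22.0, 15.0, 1.4, 0.6],
    "volume": [np.nan, np.nan, np.nan, 10.0],
    "formula": ["count", "count", "area", "volume"],
})

# Series.apply hands the function one cell at a time (here, a str)
seen_by_series_apply = df["formula"].apply(lambda x: type(x).__name__).tolist()

# DataFrame.apply with axis=1 hands the function a whole row (a Series)
seen_by_row_apply = df.apply(lambda row: type(row).__name__, axis=1).tolist()

# Indexing a str with a str is what triggers the error in the question
cell = df["formula"].iloc[0]
try:
    cell["formula"]
except TypeError as exc:
    error_message = str(exc)

print(seen_by_series_apply)
print(seen_by_row_apply)
print(error_message)
```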

CodePudding user response:

You can try something like this:

df['ans'] = df.apply(lambda x: x['volume']*10 if x['formula'] == 'volume' else x['quantity'], axis=1)

print(df)

   count  area  volume formula  quantity    ans
0    1.0  22.0     NaN   count       1.0    1.0
1    1.0  15.0     NaN   count       1.0    1.0
2    1.0   1.4     NaN    area       1.4    1.4
3    1.0   0.6    10.0  volume     100.0  100.0

Using an explicit function, you can do:

def func(row):
    if row['formula'] == 'volume':
        return row['volume'] * 10
    return row['quantity']

df['ans'] = df.apply(func, axis=1)
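
If the quantity column does not exist yet, the same row-wise function can do the lookup itself. Here is a self-contained sketch, assuming the lowercase column names from the question's table; the function name quantity_for_row is made up for illustration:

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question's table, without the quantity column
df = pd.DataFrame({
    "count": [1.0, 1.0, 1.0, 1.0],
    "area": [22.0, 15.0, 1.4, 0.6],
    "volume": [np.nan, np.nan, np.nan, 10.0],
    "formula": ["count", "count", "area", "volume"],
})

def quantity_for_row(row):
    """Return volume * 10 for 'volume' rows, else the value of the named column."""
    if row["formula"] == "volume":
        return row["volume"] * 10
    # row["formula"] holds a column name, so use it as the lookup key
    return row[row["formula"]]

df["quantity"] = df.apply(quantity_for_row, axis=1)
print(df)
```

Row-wise apply is easy to read but runs a Python function per row, so it can be slow on large frames.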

CodePudding user response:

Use a lookup:

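import numpy as np
import pandas as pd

# normalise the formula names and flag the rows that need the * 10 factor
s = df['formula'].str.lower()
m = s.eq('volume')

# idx: integer code per row, cols: the unique column names being looked up
idx, cols = pd.factorize(s)

# pick, for each row, the value from the column named in formula
df['quantity'] = (df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]
                  *  np.where(m, 10, 1)
                  )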
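
To check the vectorised lookup against the row-wise version, something like the following can be run. It assumes the sample data from the question with lowercase column names; lookup_quantity and rowwise_quantity are illustrative names only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "count": [1.0, 1.0, 1.0, 1.0],
    "area": [22.0, 15.0, 1.4, 0.6],
    "volume": [np.nan, np.nan, np.nan, 10.0],
    "formula": ["count", "count", "area", "volume"],
})

# Vectorised lookup: factorize the formula names, then fancy-index the values
s = df["formula"].str.lower()
idx, cols = pd.factorize(s)
values = df.reindex(cols, axis=1).to_numpy()
lookup_quantity = values[np.arange(len(df)), idx] * np.where(s.eq("volume"), 10, 1)

# Row-wise reference result for comparison
rowwise_quantity = df.apply(
    lambda row: row["volume"] * 10 if row["formula"] == "volume" else row[row["formula"]],
    axis=1,
).to_numpy(dtype=float)

print(lookup_quantity)
print(np.allclose(lookup_quantity, rowwise_quantity))
```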