I have a dataframe like this (simplified):
| | amount | other_amt | rule_id |
|---:|:--------|:----------|---------:|
| 0 | 2 | 0 | 101 |
| 1 | 20 | 0.5 | 102 |
| 2 | 300 | 0 | 0 |
| 3 | 50 | 1 | 101 |
I then have a set of functions that apply each of these rules to the data, such as:
```python
def rule_101(df):
    return df['amount'] / 2

def rule_102(df):
    return df['other_amt']
```
I want to create a new column by applying the matching `rule_xxx(df)` function to each row, depending on what's in the `rule_id` column, using the content of `rule_id` to build the function name inside the command that creates the new column. Something like:
```python
df['new_col'] = np.where(df['rule_id'] == '0',
                         df['amount'],
                         locals()[f'rule_{df.rule_id}'](df))
```
This bit, `f'rule_{df.rule_id}'`, is what's causing me trouble. It interpolates the full series instead of a single value, so the lookup fails with an error like:

```
KeyError: 'rule_0 0\n1 0\n2 0\n3 0\n4 0\n ..\n495 0\n496 0\n497 0\n498 0\n499 0\nName: rule_id, Length: 500, dtype: object'
```
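To illustrate (a minimal sketch, not my real data): the f-string is evaluated once against the whole column rather than once per row, so the Series' string representation ends up inside the key:

```python
import pandas as pd

rule_id = pd.Series([101, 102, 0, 101], name='rule_id')
key = f'rule_{rule_id}'   # formats the entire Series, not a single value
print(key)                # something like 'rule_0    101\n1    102\n...'
```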
How can I "align" these two inputs, so that the value in `rule_id` for each row gets inserted into the f-string, calling the function for that specific `rule_id` on that specific row? Other approaches are also welcome of course, as long as I'm able to apply the function corresponding to the `rule_id` in each row. Thanks a lot.
CodePudding user response:
You can use a dictionary to look up the rules:
```python
def rule_101(df):
    return df['amount'] / 2

def rule_102(df):
    return df['other_amt']

# Map each rule_id to the function that implements it
ruleset = {
    0: lambda k: 0,   # no rule: return a constant
    101: rule_101,
    102: rule_102
}

def rules(row):
    # Look up the function for this row's rule_id and apply it to the row
    return ruleset[row['rule_id']](row)

df['new_col'] = df.apply(rules, axis=1)
```
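For reference, a minimal self-contained check with the sample data from the question (assuming `rule_id` holds integers; if it is an object/string column, as the `KeyError` in the question suggests, the dictionary keys would need to be strings like `'0'`, `'101'`, `'102'`):

```python
import pandas as pd

def rule_101(df):
    return df['amount'] / 2

def rule_102(df):
    return df['other_amt']

ruleset = {
    0: lambda k: 0,
    101: rule_101,
    102: rule_102
}

def rules(row):
    return ruleset[row['rule_id']](row)

df = pd.DataFrame({
    'amount': [2, 20, 300, 50],
    'other_amt': [0, 0.5, 0, 1],
    'rule_id': [101, 102, 0, 101],
})

df['new_col'] = df.apply(rules, axis=1)
print(df['new_col'].tolist())   # [1.0, 0.5, 0.0, 25.0]
```

Adding a new rule only requires writing another `rule_xxx` function and registering it in `ruleset`; the row-wise dispatch in `rules` stays unchanged.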