I have a dictionary of conditions called rules, which I apply to a dataframe df. Using numpy's select(), I create a new column in df that holds, for each row, the dictionary key of the first condition that is True. The code is as follows:
import pandas as pd
import numpy as np
df = pd.DataFrame({'col1': [1, 2, 1, 3], 'col2': [4, 4, 4, 3]})
rules = {"Alert 1": df["col1"] == 1,
         "Alert 2": df["col2"] == 4}
df['alert'] = np.select(rules.values(), rules.keys(), default=None)
df
Out[2]:
   col1  col2    alert
0     1     4  Alert 1
1     2     4  Alert 2
2     1     4  Alert 1
3     3     3     None
I would like to change the dictionary rules so that each value is a list containing the original condition plus a priority value. In addition to the dictionary key being written to df, I would like this priority to be written as well. Here is the modification to rules, along with my attempt to write both the dictionary key and the priority to df:
df = pd.DataFrame({'col1': [1, 2, 1, 3], 'col2': [4, 4, 4, 3]})
rules = {"Alert 1": [df["col1"] == 1, "High"],
         "Alert 2": [df["col2"] == 4, "Medium"]}
df['alert'] = np.select(rules.values()[0], rules.keys(), default = None)
df['priority'] = np.select(rules.values()[0], rules.values()[1], default = None)
I get an error.
Ideally, I would like the output to be:
   col1  col2    alert priority
0     1     4  Alert 1     High
1     2     4  Alert 2   Medium
2     1     4  Alert 1     High
3     3     3     None     None
Is there a way to accomplish this?
P.S. I need to keep the priority with the condition in the dictionary. I don't want a separate dictionary which maps the priority onto the dictionary key.
CodePudding user response:
If you want to stick with your current approach, you can use a tuple to hold all of the values you need for each key. You then just need to pull the value at index 0 of each tuple for the alert conditions, and map the resulting alert values to index 1 for the priority:
import pandas as pd
import numpy as np
df = pd.DataFrame({'col1': [1, 2, 1, 3], 'col2': [4, 4, 4, 3]})
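# Each key maps to a (condition, priority) tuple.
rules = {"Alert 1": (df["col1"] == 1, "High"),
         "Alert 2": (df["col2"] == 4, "Medium")}
# Index 0 of each tuple is the condition passed to np.select.
df['alert'] = np.select([x[0] for x in rules.values()], list(rules.keys()), default=None)
# Index 1 is the priority, looked up from the alert name just assigned.
df['priority'] = df['alert'].map({k: v[1] for k, v in rules.items()})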
Output:
   col1  col2    alert priority
0     1     4  Alert 1     High
1     2     4  Alert 2   Medium
2     1     4  Alert 1     High
3     3     3     None      NaN
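Note that the unmatched row ends up with NaN in priority, because map() finds no entry for it, whereas the question asked for None. If that difference matters, one possible variation (a sketch only, not part of the answer above) is to run np.select a second time with the same conditions and the priorities as the choices. The names conditions and priorities below are introduced just for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 1, 3], 'col2': [4, 4, 4, 3]})

# Each key maps to a (condition, priority) tuple, so the priority stays
# with its condition inside the one dictionary.
rules = {"Alert 1": (df["col1"] == 1, "High"),
         "Alert 2": (df["col2"] == 4, "Medium")}

# Split the tuples into two parallel lists (illustrative names);
# dictionaries preserve insertion order, so the lists line up with the keys.
conditions = [cond for cond, _ in rules.values()]
priorities = [prio for _, prio in rules.values()]

# The same condition list drives both columns, so a row that matches no
# rule receives the same default in each of them.
df['alert'] = np.select(conditions, list(rules), default=None)
df['priority'] = np.select(conditions, priorities, default=None)
```

Because both columns come from the same call pattern, the first matching rule wins in each, and rows with no match share one missing-value marker instead of a None/NaN mix.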