Home > Enterprise >  How to write a conditional statement based on combination of two columns and a dictionary, using the
How to write a conditional statement based on combination of two columns and a dictionary, using the

Time:11-21

I am working with a pandas dataframe (the dataframe is called market_info_df):

enter image description here

And I have the following Python code:

market_info_df['is_and_mp'] = market_info_df['issue_status']   market_info_df['market_phase']

no_collision_issue_status = ['000', '200', '203', '204', '300']

MARKET_STATES_DICT = {
('000', '  '): MARKET_STATES.CLOSED,
('100', '  ', 'F'): MARKET_STATES.OPENING_AUCTION,
('200', '  '): MARKET_STATES.CONTINUOUS_TRADING,
('203', '  '): MARKET_STATES.UNSCHEDULED_AUCTION,
('204', '  '): MARKET_STATES.UNSCHEDULED_AUCTION,
('100', 'B0'): MARKET_STATES.UNSCHEDULED_AUCTION,
('200', 'B1'): MARKET_STATES.CONTINUOUS_TRADING,
('400', 'C0'): MARKET_STATES.HALTED,
('400', 'C1'): MARKET_STATES.CONTINUOUS_TRADING,
('400', 'D0'): MARKET_STATES.HALTED,
('400', 'D1'): MARKET_STATES.POST_TRADE}

I am trying to write a condition such that if the is_and_mp is in the no_collision_issue_status list, OR the trading_status is not , then use the MARKET_STATES_DICT to map a new column called market_state.

Here is what I have written, but I get an error TypeError: unhashable type: 'Series':

market_info_df.loc[(market_info_df['is_and_mp'] in no_collision_issue_status) | (~market_info_df['trading_state'] == '  '),
                           'market_state'] = MARKET_STATES_DICT[(market_info_df['issue_status'], market_info_df['trading_state'])]

I understand what is wrong and why I am getting the error, but I am not sure how to fix it!

CodePudding user response:

Use apply function on dataframe. Check for the desired condition as you have written. If true then return the value from dict else return None:

market_info_df["market_state"] = market_info_df.apply(lambda row: MARKET_STATES_DICT[(row["is_and_mp"],row["trading_status"])] if row["is_and_mp"] in no_collision_issue_status or row["trading_status"] != "  " else None, axis=1)

Full example with dummy data:

market_info_df = pd.DataFrame(data=[["10","0","B0"],["20","0","  "],["40","0","D1"]], columns=["issue_status", "market_phase", "trading_status"])

market_info_df['is_and_mp'] = market_info_df['issue_status']   market_info_df['market_phase']

no_collision_issue_status = ['000', '200', '203', '204', '300']

MARKET_STATES_DICT = {
('000', '  '): "CLOSED",
('100', '  ', 'F'): "OPENING_AUCTION",
('200', '  '): "CONTINUOUS_TRADING",
('203', '  '): "UNSCHEDULED_AUCTION",
('204', '  '): "UNSCHEDULED_AUCTION",
('100', 'B0'): "UNSCHEDULED_AUCTION",
('200', 'B1'): "CONTINUOUS_TRADING",
('400', 'C0'): "HALTED",
('400', 'C1'): "CONTINUOUS_TRADING",
('400', 'D0'): "HALTED",
('400', 'D1'): "POST_TRADE"}

market_info_df["market_state"] = market_info_df.apply(lambda row: MARKET_STATES_DICT[(row["is_and_mp"],row["trading_status"])] if row["is_and_mp"] in no_collision_issue_status or row["trading_status"] != "  " else None, axis=1)

[Out]:
  issue_status market_phase trading_status is_and_mp         market_state
0           10            0             B0       100  UNSCHEDULED_AUCTION
1           20            0                      200   CONTINUOUS_TRADING
2           40            0             D1       400           POST_TRADE
  • Related