I am working with a pandas dataframe (the dataframe is called market_info_df
):
And I have the following Python code:
market_info_df['is_and_mp'] = market_info_df['issue_status'] market_info_df['market_phase']
no_collision_issue_status = ['000', '200', '203', '204', '300']
MARKET_STATES_DICT = {
('000', ' '): MARKET_STATES.CLOSED,
('100', ' ', 'F'): MARKET_STATES.OPENING_AUCTION,
('200', ' '): MARKET_STATES.CONTINUOUS_TRADING,
('203', ' '): MARKET_STATES.UNSCHEDULED_AUCTION,
('204', ' '): MARKET_STATES.UNSCHEDULED_AUCTION,
('100', 'B0'): MARKET_STATES.UNSCHEDULED_AUCTION,
('200', 'B1'): MARKET_STATES.CONTINUOUS_TRADING,
('400', 'C0'): MARKET_STATES.HALTED,
('400', 'C1'): MARKET_STATES.CONTINUOUS_TRADING,
('400', 'D0'): MARKET_STATES.HALTED,
('400', 'D1'): MARKET_STATES.POST_TRADE}
I am trying to write a condition such that if the is_and_mp
is in the no_collision_issue_status
list, OR
the trading_status
is not
, then use the MARKET_STATES_DICT
to map a new column called market_state
.
Here is what I have written, but I get an error TypeError: unhashable type: 'Series'
:
market_info_df.loc[(market_info_df['is_and_mp'] in no_collision_issue_status) | (~market_info_df['trading_state'] == ' '),
'market_state'] = MARKET_STATES_DICT[(market_info_df['issue_status'], market_info_df['trading_state'])]
I understand what is wrong and why I am getting the error, but I am not sure how to fix it!
CodePudding user response:
Use apply function on dataframe. Check for the desired condition as you have written. If true then return the value from dict else return None:
market_info_df["market_state"] = market_info_df.apply(lambda row: MARKET_STATES_DICT[(row["is_and_mp"],row["trading_status"])] if row["is_and_mp"] in no_collision_issue_status or row["trading_status"] != " " else None, axis=1)
Full example with dummy data:
market_info_df = pd.DataFrame(data=[["10","0","B0"],["20","0"," "],["40","0","D1"]], columns=["issue_status", "market_phase", "trading_status"])
market_info_df['is_and_mp'] = market_info_df['issue_status'] market_info_df['market_phase']
no_collision_issue_status = ['000', '200', '203', '204', '300']
MARKET_STATES_DICT = {
('000', ' '): "CLOSED",
('100', ' ', 'F'): "OPENING_AUCTION",
('200', ' '): "CONTINUOUS_TRADING",
('203', ' '): "UNSCHEDULED_AUCTION",
('204', ' '): "UNSCHEDULED_AUCTION",
('100', 'B0'): "UNSCHEDULED_AUCTION",
('200', 'B1'): "CONTINUOUS_TRADING",
('400', 'C0'): "HALTED",
('400', 'C1'): "CONTINUOUS_TRADING",
('400', 'D0'): "HALTED",
('400', 'D1'): "POST_TRADE"}
market_info_df["market_state"] = market_info_df.apply(lambda row: MARKET_STATES_DICT[(row["is_and_mp"],row["trading_status"])] if row["is_and_mp"] in no_collision_issue_status or row["trading_status"] != " " else None, axis=1)
[Out]:
issue_status market_phase trading_status is_and_mp market_state
0 10 0 B0 100 UNSCHEDULED_AUCTION
1 20 0 200 CONTINUOUS_TRADING
2 40 0 D1 400 POST_TRADE