Pandas - Assign value to subset of dataframe, based on multiple conditions-CodePudding

So I have one large file covering all the markets, as well as a dict of symbols with the ticker as key. I only want to update the "submarket" column for "Mk 1" rows, which I know can easily be done with

table2.loc[table2['Market'] == "Mk 1" , ['Sub Market']]

I have been trying to assign "A" to submarket if the symbol (i.e. ticker) is in the dict other wise "B" but keep getting TypeError: unhashable type: 'Series'

dict:

key (ticker)	value (Name)
ABC	ABC Corp
XYZ	XYZ Corp

table 2

Market	Sub Market	Symbol
Mk1		ABC
Mk 1		ABC
Mk 1		123
Mk 2		123
Mk 3		XYZ

expected result

Market	Sub Market	Symbol
Mk1	A	ABC
Mk 1	A	ABC
Mk 1	B	123
Mk 2		123
Mk 3		XYZ

CodePudding user response：

Use isin and map:

df.loc[df['Market'].isin(['Mk 1', 'Mk1']), 'Sub Market'] = df['Symbol'].isin(dct).map({True:'A', False:'B'})

Output:

>>> df
  Market Sub Market Symbol
0    Mk1          A    ABC
1   Mk 1          A    ABC
2   Mk 1          B    123
3   Mk 2          B    123
4   Mk 3          A    XYZ

CodePudding user response：

One option is to use numpy.where with a boolean mask msk to filter the Market values equal to "Mk 1":

import numpy as np
msk = df2['Market'].isin(['Mk1', 'Mk 1'])
df2.loc[msk, 'Sub Market'] = np.where(df2.loc[msk, 'Symbol'].isin(dct['key (ticker)']), 'A', 'B')

Output:

  Market Sub Market Symbol
0    Mk1          A    ABC
1   Mk 1          A    ABC
2   Mk 1          B    123
3   Mk 2        NaN    123
4   Mk 3        NaN    XYZ

Note that it's spelt both Mk1 and Mk 1 (and I assume they are the same market, since you're treating them the same); that's why we need to use isin to construct msk, otherwise, it could be df2['Market']=='Mk 1'.