Home > Back-end >  Pandas - Assign value to subset of dataframe, based on multiple conditions
Pandas - Assign value to subset of dataframe, based on multiple conditions

Time:03-16

So I have one large file covering all the markets, as well as a dict of symbols with the ticker as key. I only want to update the "submarket" column for "Mk 1" rows, which I know can easily be done with

table2.loc[table2['Market'] == "Mk 1" , ['Sub Market']]

I have been trying to assign "A" to submarket if the symbol (i.e. ticker) is in the dict other wise "B" but keep getting TypeError: unhashable type: 'Series'

dict:

key (ticker) value (Name)
ABC ABC Corp
XYZ XYZ Corp

table 2

Market Sub Market Symbol
Mk1 ABC
Mk 1 ABC
Mk 1 123
Mk 2 123
Mk 3 XYZ

expected result

Market Sub Market Symbol
Mk1 A ABC
Mk 1 A ABC
Mk 1 B 123
Mk 2 123
Mk 3 XYZ

CodePudding user response:

Use isin and map:

df.loc[df['Market'].isin(['Mk 1', 'Mk1']), 'Sub Market'] = df['Symbol'].isin(dct).map({True:'A', False:'B'})

Output:

>>> df
  Market Sub Market Symbol
0    Mk1          A    ABC
1   Mk 1          A    ABC
2   Mk 1          B    123
3   Mk 2          B    123
4   Mk 3          A    XYZ

CodePudding user response:

One option is to use numpy.where with a boolean mask msk to filter the Market values equal to "Mk 1":

import numpy as np
msk = df2['Market'].isin(['Mk1', 'Mk 1'])
df2.loc[msk, 'Sub Market'] = np.where(df2.loc[msk, 'Symbol'].isin(dct['key (ticker)']), 'A', 'B')

Output:

  Market Sub Market Symbol
0    Mk1          A    ABC
1   Mk 1          A    ABC
2   Mk 1          B    123
3   Mk 2        NaN    123
4   Mk 3        NaN    XYZ

Note that it's spelt both Mk1 and Mk 1 (and I assume they are the same market, since you're treating them the same); that's why we need to use isin to construct msk, otherwise, it could be df2['Market']=='Mk 1'.

  • Related