I am sincerely sorry if this is a duplicate. I have searched for a long time and still get one of these errors: "TypeError: _select_dispatcher() got an unexpected keyword argument 'na'" or "TypeError: invalid entry 0 in condlist: should be boolean ndarray".
I have a dataframe:
import numpy as np
import pandas as pd

data_1 = {'A': ['Emo/3', 'Emo/4', 'Emo/1', 'Emo/3', '', 'Emo/3', 'Emo/4', 'Emo/1', 'Emo/3', '', 'Neu/5', 'Neu/2', 'Neu/5', 'Neu/2'],
          'Pos': ["repeat3", "repeat3", "repeat3", "repeat3", '', "repeat1", "repeat1", "repeat1", "repeat1", '', "repeat2", "repeat2", "repeat2", "repeat2"],
          'B': [0, 0, 0, 0, '', 1, 2, 3, 4, '', 4, 2, 3, 1],
          'C': [0, 2, 1, 3, '', 4, 2, 3, 1, '', 4, 2, 3, 1]}
df_1 = pd.DataFrame(data_1)
df_1
        A      Pos  B  C
0   Emo/3  repeat3  0  0
1   Emo/4  repeat3  0  2
2   Emo/1  repeat3  0  1
3   Emo/3  repeat3  0  3
4
5   Emo/3  repeat1  1  4
6   Emo/4  repeat1  2  2
7   Emo/1  repeat1  3  3
8   Emo/3  repeat1  4  1
9
10  Neu/5  repeat2  4  4
11  Neu/2  repeat2  2  2
12  Neu/5  repeat2  3  3
13  Neu/2  repeat2  1  1
I want to create a column D based on columns B and C. If a row satisfies one of the criteria, put a number in D; if not, leave it empty. Here is my code:
conditions = [
df_1.loc[(df_1['B']==1)&(df_1['C']==1)],
df_1.loc[(df_1['B']==2)&(df_1['C']==1)],
df_1.loc[(df_1['B']==3)&(df_1['C']==1)],
]
choices = [1,1,0]
df_1['D'] = np.select(conditions, choices, default='')
Thanks!
CodePudding user response:
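You shouldn't be using .loc in your conditions. df_1.loc[...] returns the matching rows as a DataFrame, while np.select expects each condition to be a boolean array with one entry per row.
Also, it's not a good idea to mix strings and numbers in a column, so you should set your default value to NaN instead of ''.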
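To see the difference between the two expressions, here is a minimal sketch on a small hypothetical frame named df (not the data from the question). It only illustrates what each expression returns and why the bare mask is what np.select can use:

```python
import numpy as np
import pandas as pd

# Small hypothetical frame, used only for illustration
df = pd.DataFrame({'B': [1, 2, 3], 'C': [1, 1, 4]})

mask = (df['B'] == 1) & (df['C'] == 1)  # boolean Series, one entry per row
subset = df.loc[mask]                   # DataFrame holding only the matching rows

print(type(mask).__name__, mask.dtype, len(mask))  # Series bool 3
print(type(subset).__name__, len(subset))          # DataFrame 1

# The mask lines up with the rows of df, so np.select accepts it
result = np.select([mask], [1], default=np.nan)
print(result)
```

The mask has the same length as the frame, so it can be used row by row; the .loc result has already dropped the non-matching rows and is no longer a boolean array.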
Try:
conditions = [(df_1['B']==1)&(df_1['C']==1),
(df_1['B']==2)&(df_1['C']==1),
(df_1['B']==3)&(df_1['C']==1)]
choices = [1,1,0]
df_1['D'] = np.select(conditions, choices, default=np.nan)
>>> df_1
        A      Pos  B  C    D
0   Emo/3  repeat3  0  0  NaN
1   Emo/4  repeat3  0  2  NaN
2   Emo/1  repeat3  0  1  NaN
3   Emo/3  repeat3  0  3  NaN
4                         NaN
5   Emo/3  repeat1  1  4  NaN
6   Emo/4  repeat1  2  2  NaN
7   Emo/1  repeat1  3  3  NaN
8   Emo/3  repeat1  4  1  NaN
9                         NaN
10  Neu/5  repeat2  4  4  NaN
11  Neu/2  repeat2  2  2  NaN
12  Neu/5  repeat2  3  3  NaN
13  Neu/2  repeat2  1  1  1.0
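The same advice about not mixing strings and numbers also applies to B and C, which hold blank strings next to integers. One option, which goes beyond the code above, is to convert those columns to numbers before comparing. This is a minimal sketch, assuming it is acceptable for the blank cells to become NaN (pd.to_numeric with errors='coerce' does that):

```python
import numpy as np
import pandas as pd

data_1 = {'A': ['Emo/3', 'Emo/4', 'Emo/1', 'Emo/3', '', 'Emo/3', 'Emo/4', 'Emo/1', 'Emo/3', '', 'Neu/5', 'Neu/2', 'Neu/5', 'Neu/2'],
          'Pos': ["repeat3", "repeat3", "repeat3", "repeat3", '', "repeat1", "repeat1", "repeat1", "repeat1", '', "repeat2", "repeat2", "repeat2", "repeat2"],
          'B': [0, 0, 0, 0, '', 1, 2, 3, 4, '', 4, 2, 3, 1],
          'C': [0, 2, 1, 3, '', 4, 2, 3, 1, '', 4, 2, 3, 1]}
df_1 = pd.DataFrame(data_1)

# Assumption: blank cells may be replaced by NaN, which makes B and C numeric
for col in ['B', 'C']:
    df_1[col] = pd.to_numeric(df_1[col], errors='coerce')

# Each condition is a plain boolean Series, with no .loc involved
conditions = [(df_1['B'] == 1) & (df_1['C'] == 1),
              (df_1['B'] == 2) & (df_1['C'] == 1),
              (df_1['B'] == 3) & (df_1['C'] == 1)]
choices = [1, 1, 0]
df_1['D'] = np.select(conditions, choices, default=np.nan)

print(df_1)
```

Comparisons against NaN evaluate to False, so the blank rows simply fall through to the default and D matches the result shown above.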