Python error: Boolean Series key will be reindexed to match DataFrame index

Time:10-19

I am working on a project in Python, and while the code below works, I am getting this hideous warning:

UserWarning: Boolean Series key will be reindexed to match DataFrame index.

dummy_types = pd.get_dummies(df_pokemon_co, columns=['Type 1', 'Type 2'])

df_pokemon_co['Rock'] = dummy_types['Type 1_Rock'] + dummy_types['Type 2_Rock']
df_pokemon_co['Ground'] = dummy_types['Type 1_Ground'] + dummy_types['Type 2_Ground']
df_pokemon_co['Water'] = dummy_types['Type 1_Water'] + dummy_types['Type 2_Water']

df_pokemon_co['Sum'] = df_pokemon_co['Rock'] + df_pokemon_co['Ground'] + df_pokemon_co['Water']

print('Total rock:', np.sum(np.any(df_pokemon_co[df_pokemon_co['Sum']==2][df_pokemon_co['Rock']==1], axis=1)))
print('Total ground:', np.sum(np.any(df_pokemon_co[df_pokemon_co['Sum']==2][df_pokemon_co['Ground']==1], axis=1)))
print('Total water:', np.sum(np.any(df_pokemon_co[df_pokemon_co['Sum']==2][df_pokemon_co['Water']==1], axis=1)))

I have worked out that if I remove the following from the print lines the message goes away, but I am not quite sure how to remedy it.

[df_pokemon_co['Sum']==2]

Does anyone have any idea how to fix this? I have seen some other posts related to this warning, but it seems to be raised for different reasons in those cases.

Thanks in advance :)

Answer:

If you need to count the rows that match two conditions, chain the conditions with & (bitwise AND) and use sum to count the True values:

print('Total rock:', ((df_pokemon_co['Sum']==2) & (df_pokemon_co['Rock']==1)).sum())
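
As a minimal sketch of why the warning appears (using a small made-up frame, not the Pokemon data): the first filter, df[df['Sum']==2], returns a frame with fewer rows, while the second mask, df['Rock']==1, is still indexed by the full frame, so pandas has to reindex the mask to match. Combining both conditions into a single mask avoids that. The column values below are assumptions chosen only for illustration:

```python
import warnings

import pandas as pd

# Toy data (hypothetical values, only for illustration)
df = pd.DataFrame({
    'Rock':   [1, 0, 1, 0],
    'Ground': [1, 1, 0, 0],
    'Water':  [0, 1, 0, 1],
})
df['Sum'] = df['Rock'] + df['Ground'] + df['Water']

# Chained indexing: the second mask covers all 4 rows, but the frame it is
# applied to has already been filtered, so pandas reindexes the mask and
# emits the UserWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    chained = df[df['Sum'] == 2][df['Rock'] == 1]

# Single combined mask: both conditions are evaluated on the full frame.
combined = (df['Sum'] == 2) & (df['Rock'] == 1)

print('Warnings caught:', [str(w.message) for w in caught])
print('Rows via chaining:', len(chained))
print('Rows via combined mask:', combined.sum())
```

Both approaches select the same rows here; the combined mask just does it without the index mismatch.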

For the median, use DataFrame.loc to select rows by the condition and the column by name:

med = (df_pokemon_co.loc[(df_pokemon_co['Rock']==1) & 
                         (df_pokemon_co['Sum']==2), 'Defense'].median())

To avoid computing the same mask multiple times, assign it to a variable:

mask1 = (df_pokemon_co['Sum']==2) & (df_pokemon_co['Rock']==1)

print('Total rock:', mask1.sum())
print('Median rock:', df_pokemon_co.loc[mask1, 'Defense'].median())
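
Since the question repeats the same check for Rock, Ground and Water, the mask approach can also be put in a loop. This is only a sketch on a small made-up frame; the values and the stats dictionary are assumptions for illustration, not part of the original data:

```python
import pandas as pd

# Toy data (hypothetical values, only for illustration)
df = pd.DataFrame({
    'Rock':    [1, 1, 0, 1],
    'Ground':  [1, 0, 1, 0],
    'Water':   [0, 1, 1, 0],
    'Defense': [100, 130, 60, 90],
})
df['Sum'] = df['Rock'] + df['Ground'] + df['Water']

stats = {}
for col in ['Rock', 'Ground', 'Water']:
    # Build the mask once per type and reuse it for every statistic
    mask = (df['Sum'] == 2) & (df[col] == 1)
    stats[col] = {
        'total': int(mask.sum()),                           # matching rows
        'median': float(df.loc[mask, 'Defense'].median()),  # their median Defense
    }

for col, values in stats.items():
    print(f"Total {col.lower()}:", values['total'])
    print(f"Median {col.lower()}:", values['median'])
```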