numpy conditional array filtering with Enum

Time:10-11

I have an array of tuples that represents people's names and genders.

from enum import Enum

import numpy as np

class Gender(Enum):
    MALE = 1
    FEMALE = 2

people = np.array(
[
('John Smith', Gender.MALE),
('Samantha Wheeler', Gender.FEMALE),
])

I'm trying to filter them by gender like so:

guys = np.where(people[1] == Gender.MALE)
girls = np.where(people[1] == Gender.FEMALE)

It doesn't seem to work, even though the condition looks fine. What am I doing wrong?

CodePudding user response:

people[1] selects the second row, not the gender column. To compare column 1 across all rows, index with [:,1] as below:

>>> people = np.array([('John Smith', Gender.MALE),('Samantha Wheeler', Gender.FEMALE),
...                   ('John Smith', Gender.MALE),('Samantha Wheeler', Gender.FEMALE)])
>>> guys = np.where(people[:,1] == Gender.MALE)
>>> girls = np.where(people[:,1] == Gender.FEMALE)

>>> girls
(array([1, 3]),)
>>> people[girls][:,0]
array(['Samantha Wheeler', 'Samantha Wheeler'], dtype=object)


# second approach
>>> row_guys, columns_guys = np.where(people == Gender.MALE)
>>> people[row_guys][:,0]
array(['John Smith', 'John Smith'], dtype=object)
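
A boolean mask is another way to write the same filter, and it also shows why the original people[1] comparison finds nothing. This is a minimal sketch, assuming the two-row array from the question; the names row_check, is_male, guy_names and girl_names are illustrative and not from the original post.

```python
from enum import Enum

import numpy as np


class Gender(Enum):
    MALE = 1
    FEMALE = 2


# dtype=object is stated explicitly here; the array holds str and Enum objects.
people = np.array(
    [
        ('John Smith', Gender.MALE),
        ('Samantha Wheeler', Gender.FEMALE),
    ],
    dtype=object,
)

# people[1] is the second ROW (Samantha's name and gender), so this compares
# both of her fields to Gender.MALE and neither one matches.
row_check = people[1] == Gender.MALE

# people[:, 1] is the gender COLUMN, one entry per person.
is_male = people[:, 1] == Gender.MALE

# Use the mask to pick rows, and 0 to pick the name column.
guy_names = people[is_male, 0]
girl_names = people[~is_male, 0]

print(row_check.tolist())
print(guy_names.tolist(), girl_names.tolist())
```

Indexing with the mask directly skips the intermediate index tuple that np.where returns, which can be convenient when only the filtered rows are needed.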

CodePudding user response:

NumPy isn't ideal for mixed-dtype data like yours. You should use pandas instead:

import pandas as pd

df = pd.DataFrame({
    'name': ['John Smith', 'Samantha Wheeler'],
    'gender': ['male', 'female'],
})

# this step is optional and is roughly analogous to using an Enum
df['gender'] = df['gender'].astype('category')

print(df[df['gender'] == 'male'])
#          name gender
# 0  John Smith   male

print(df[df['gender'] == 'female'])
#                name  gender
# 1  Samantha Wheeler  female
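
If the data already exists as (name, Gender) tuples, the Enum can be carried over into the DataFrame rather than retyped as strings. The following is a rough sketch of one way to do that, assuming the Gender class from the question; the variable names records, guys and girls are hypothetical, and storing the member names as a categorical column is just one possible choice.

```python
from enum import Enum

import pandas as pd


class Gender(Enum):
    MALE = 1
    FEMALE = 2


# Same shape of data as in the question: (name, Gender) tuples.
records = [
    ('John Smith', Gender.MALE),
    ('Samantha Wheeler', Gender.FEMALE),
]

df = pd.DataFrame(records, columns=['name', 'gender'])

# Replace the Enum objects with their names, stored as a categorical column
# whose allowed categories are exactly the Enum's member names.
df['gender'] = pd.Categorical(
    [g.name for g in df['gender']],
    categories=[g.name for g in Gender],
)

# Filter by comparing against the Enum member's name.
guys = df[df['gender'] == Gender.MALE.name]
girls = df[df['gender'] == Gender.FEMALE.name]

print(guys['name'].tolist())
print(girls['name'].tolist())
```

Comparing against Gender.MALE.name instead of a hand-typed string keeps the filter tied to the Enum definition.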