Why does np.select not allow me to put in index above total length into choicelist?

Time:03-16

I am trying to get the first value of the list in each row of df['Emails']. In real life (this is a sample df) I don't know how long each list will be, so I am assuming the longest has length 5, then trying to whittle that down until I find the right length and select that index position. However, I am getting IndexError: index 5 is out of bounds for axis 0 with size 2, and I can't figure out what to do about it. Any help appreciated. Thanks.

my current code:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Emails': [['[email protected]', '[email protected]', '[email protected]'],[None, '[email protected]']],
                   'num_wings': [2, 0],
                   'num_specimen_seen': [10, 2]},
                  index=['falcon', 'dog'])
# This is the line that raises the IndexError
df['Emails'] = np.select([df['Emails'][0],df['Emails'][1],df['Emails'][2]],[df['Emails'][0],df['Emails'][1],df['Emails'][2]])
print(df['Emails'])

Expected output:

If a list has None in its first position, I want to take the next element that isn't None.

Desired Output

              Emails  num_wings  num_specimen_seen
falcon   [email protected]          2                 10
dog     [email protected]          0                  2

CodePudding user response:

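The IndexError comes from df['Emails'][i], which looks up row i of the column rather than element i of each list; the sample has only two rows, hence "size 2". Whenever you have a column containing lists, explode will often be your friend, and this is the case here.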

Use explode to give each list element its own row, groupby(level=0) to group on the 0th (first) level of the index, and first to select the first non-null value in each group (it skips None, NaN, etc.).

df['Emails'] = df['Emails'].explode().groupby(level=0).first()

Output:

>>> df
               Emails  num_wings  num_specimen_seen
falcon    [email protected]          2                 10
dog     [email protected]          0                  2
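
For reference, the steps above can be sketched as a self-contained script with the intermediate results named. The addresses here are hypothetical placeholders (the real values are obscured in the post), and the variable names exploded and first_email are illustrative only.

```python
import pandas as pd

# Hypothetical placeholder addresses; the original values are obscured in the post.
df = pd.DataFrame(
    {
        "Emails": [
            ["a@example.com", "b@example.com", "c@example.com"],
            [None, "d@example.com"],
        ],
        "num_wings": [2, 0],
        "num_specimen_seen": [10, 2],
    },
    index=["falcon", "dog"],
)

# explode() gives every list element its own row and repeats the index label.
exploded = df["Emails"].explode()

# Grouping on the index and taking first() keeps the first non-null
# element for each original row, so the leading None for "dog" is skipped.
first_email = exploded.groupby(level=0).first()

# Assignment aligns on the index labels, so row order does not matter.
df["Emails"] = first_email
print(df)
```

Because the grouping key is the index, this sketch assumes the index labels are unique; rows sharing a label would be merged into one group. If that is a concern, calling reset_index(drop=True) before exploding is one way around it.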