How to use the loc function in pandas to apply a value to exact rows

Time:03-22

I have a simple dataframe like below:

import pandas as pd

dict1 = {'student name': ['A', 'B', 'C', 'D', 'E'],
         'math': [0, 9, 0, 6, 7],
         'Eng': [0, 6, 7, 8, 9],
         'Chemical': [0, 4, 5, 7, 8]
         }
df = pd.DataFrame(dict1)
df2 = df.loc[(df[['math']]==0).any(1),'mark'].apply(lambda x: x['mark'] =='Fail')

I would like to create a new column 'mark' that contains 'Fail' for the rows where 'math' is 0, but when I ran the code I got the error shown below. Could you please help me with this issue? Expected output:

  student name  math  Eng  Chemical  mark
0            A     0    0         0  Fail
1            B     9    6         4  Pass
2            C     0    7         5  Fail
3            D     6    8         7  Pass
4            E     7    9         8  Pass

KeyError                                  Traceback (most recent call last)
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\indexes\base.py:3621, in Index.get_loc(self, key, method, tolerance)
   3620 try:
-> 3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\_libs\index.pyx:136, in pandas._libs.index.IndexEngine.get_loc()

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\_libs\index.pyx:163, in pandas._libs.index.IndexEngine.get_loc()

File pandas\_libs\hashtable_class_helper.pxi:5198, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas\_libs\hashtable_class_helper.pxi:5206, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'mark'

User response:

Use numpy.where:

import numpy as np
df2 = df.assign(mark=np.where(df['math'].eq(0),
                              'Fail', 'Pass'))

Output:

  student name  math  Eng  Chemical  mark
0            A     0    0         0  Fail
1            B     9    6         4  Pass
2            C     0    7         5  Fail
3            D     6    8         7  Pass
4            E     7    9         8  Pass
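
Since the question asks about loc specifically, here is a minimal sketch of the same result using a boolean mask with loc. The KeyError in the question comes from df.loc[..., 'mark'] trying to read a column named 'mark' that does not exist yet; assigning through loc avoids that lookup. Filling in 'Pass' as the default first is an assumption based on the expected output, and the name mask is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'student name': ['A', 'B', 'C', 'D', 'E'],
    'math': [0, 9, 0, 6, 7],
    'Eng': [0, 6, 7, 8, 9],
    'Chemical': [0, 4, 5, 7, 8],
})

# Start with a default value so every row has a mark
# (assumed default, taken from the expected output).
df['mark'] = 'Pass'

# Boolean mask: True for rows where math is 0.
mask = df['math'].eq(0)

# Assigning through loc overwrites only the masked rows.
df.loc[mask, 'mark'] = 'Fail'

print(df)
```

Unlike the assign version, this modifies df in place rather than returning a new DataFrame.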

If you want 'Fail' whenever there is a zero in any subject, use:

cols = ['math', 'Eng', 'Chemical']

df2 = df.assign(mark=np.where(df[cols].eq(0).any(axis=1),
                        'Fail', 'Pass'))
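
On the sample data the two conditions happen to select the same rows, so the difference is easier to see with one more row. The sketch below adds a hypothetical student 'F' (not in the original data) whose only zero is in 'Eng', and computes both variants side by side; the names math_only and any_subject are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'student name': ['A', 'B', 'C', 'D', 'E', 'F'],  # 'F' is a hypothetical extra row
    'math': [0, 9, 0, 6, 7, 5],
    'Eng': [0, 6, 7, 8, 9, 0],
    'Chemical': [0, 4, 5, 7, 8, 3],
})
cols = ['math', 'Eng', 'Chemical']

# 'Fail' only when math is 0.
math_only = np.where(df['math'].eq(0), 'Fail', 'Pass')

# 'Fail' when any of the listed subjects is 0 (row-wise check).
any_subject = np.where(df[cols].eq(0).any(axis=1), 'Fail', 'Pass')

print(df.assign(math_only=math_only, any_subject=any_subject))
```

Student 'F' passes under the math-only rule but fails under the any-subject rule; the other rows are marked the same either way.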