How to use NumPy vectorize instead of the apply function

Time:10-19

I have a pandas Dataframe as follows

import numpy as np
import pandas as pd

data = {
    'ID': [0, 0, 0, 0, 0, 1],
    'DAYS': [293, 1111, 3020, 390, 210, 10],
}

df = pd.DataFrame(data, columns=['ID', 'DAYS'])

   ID  DAYS
0   0   293
1   0  1111
2   0  3020
3   0   390
4   0   210
5   1    10

What I am trying to do is a simple apply with the following condition, writing the result to a boolean column:

df['bool'] = df.apply(lambda x: x['DAYS'] < 365, axis=1)

I would like to optimize this apply-lambda part. I managed to do it with NumPy:

df['bool_numpy'] = np.where(df['DAYS'] < 365, True, False)

But I am struggling to do the same thing with np.vectorize:

def copy_filter(df):
    if df['DAYS'] <365:
        return True
    else:
        return False

a= np.vectorize(copy_filter, otypes = [bool])
df['bool_vectorize'] = a(df['DAYS'])

but this gave me an error. Any help would be appreciated. Any other optimization technique for this problem would be great as well!

CodePudding user response:

You need neither apply nor vectorize for this. Comparing the column directly already returns a boolean Series:

df['bool'] = df['DAYS'] < 365

output:

   ID  DAYS   bool
0   0   293   True
1   0  1111  False
2   0  3020  False
3   0   390  False
4   0   210   True
5   1    10   True
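
As a quick sanity check, the direct comparison can be put side by side with the apply and np.where versions from the question. This is only a sketch on the sample data above; the variable names (direct, method, via_apply, via_where, all_match) are made up for illustration, and Series.lt is shown as an equivalent spelling of the < operator.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': [0, 0, 0, 0, 0, 1],
    'DAYS': [293, 1111, 3020, 390, 210, 10],
})

# Direct, column-wise comparison (the approach from this answer)
direct = df['DAYS'] < 365

# Same comparison spelled as a method call
method = df['DAYS'].lt(365)

# Row-wise apply from the question (one Python call per row)
via_apply = df.apply(lambda row: row['DAYS'] < 365, axis=1)

# np.where from the question; the condition is already boolean,
# so wrapping it in np.where(..., True, False) adds nothing
via_where = np.where(df['DAYS'] < 365, True, False)

# All four approaches should agree element by element
all_match = (
    direct.tolist()
    == method.tolist()
    == via_apply.tolist()
    == via_where.tolist()
)
print(all_match)
```

Since the results agree, the plain comparison is the simplest choice and avoids calling a Python function for every row.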

CodePudding user response:

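np.vectorize calls the function once per element, so the argument is a single DAYS value rather than a row or a DataFrame, and indexing it with ['DAYS'] fails. Change your function to take the value directly: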

def copy_filter(x):
    # x is a single value from the DAYS column, not a row
    if x < 365:
        return True
    else:
        return False

a = np.vectorize(copy_filter, otypes=[bool])
df['bool_vectorize'] = a(df['DAYS'])
df

output:

   ID  DAYS  bool_vectorize
0   0   293            True
1   0  1111           False
2   0  3020           False
3   0   390           False
4   0   210            True
5   1    10            True
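
To see why the function from the question fails, the sketch below records what np.vectorize actually passes to the wrapped function. The helper names (record_and_filter, seen, copy_filter_original, error_name) are hypothetical and exist only for this illustration; the exact exception type raised by the original function may depend on the NumPy version, so the sketch catches both TypeError and IndexError rather than assuming one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': [0, 0, 0, 0, 0, 1],
    'DAYS': [293, 1111, 3020, 390, 210, 10],
})

seen = []  # type names of the arguments the function receives

def record_and_filter(x):
    # np.vectorize hands over one element at a time
    seen.append(type(x).__name__)
    return x < 365

vec = np.vectorize(record_and_filter, otypes=[bool])
result = vec(df['DAYS'])
print(set(seen))  # scalar type(s), never a Series or DataFrame
print(result)

def copy_filter_original(row):
    # Version from the question: treats its argument as a row
    return row['DAYS'] < 365

error_name = None
try:
    np.vectorize(copy_filter_original, otypes=[bool])(df['DAYS'])
except (TypeError, IndexError) as exc:
    # A scalar cannot be indexed with a column label
    error_name = type(exc).__name__
print(error_name)
```

Note that the NumPy documentation describes np.vectorize as a convenience whose implementation is essentially a for loop, so it should not be expected to match the speed of the direct column comparison.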