Pandas Return TypeError when slicing numpy error using apply and lambda

Time:02-25

The objective is to use apply to slice a range of an array and apply NumPy argmax to that slice.

However, Python raises a TypeError:

TypeError: slice indices must be integers or None or have an __index__ method 

For the following code

import pandas as pd
import numpy as np

arr=np.array([10,2,5,3,6,8,3,3,2,5,6,8,11,14,11,100,1,3,20,21])
arr=arr.reshape((1,-1))
df=pd.DataFrame(zip([4,7],[15,18],[25,40]),columns=['lb','rb','mv'])
df['ss'] = df.apply(lambda x: np.argmax(arr[0][x['lb']:x['rb']] >= 0.3 * x['mv'] ), axis=1)
df['cc']=0.3 *df['mv']
df['es'] = df.apply(lambda x: np.argmax(arr[0][x['ss']:-1] < x['cc'] ), axis=1)

However, if I modify the two lines

df['cc']=0.3 *df['mv']
df['es'] = df.apply(lambda x: np.argmax(arr[0][x['ss']:-1] < x['cc'] ), axis=1)

into

df['es'] = df.apply(lambda x: np.argmax(arr[0][x['ss']:-1] < 0.3 *x['mv'] ), axis=1)

The program works like a charm.

I am curious why this issue arises.

CodePudding user response:

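The problem is that multiplying by a float makes cc a float column. When apply is used with axis=1, each row is passed to the function as a Series with a single dtype, so once the frame contains a float column every value in the row, including x['ss'], is upcast to float64: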

df['cc']=0.3 *df['mv']

def f(x):
    print(x)

df['es'] = df.apply(f, axis=1)

Output:

lb     4.0
rb    15.0
mv    25.0
ss     1.0
cc     7.5
Name: 0, dtype: float64
lb     7.0
rb    18.0
mv    40.0
ss     6.0
cc    12.0
Name: 1, dtype: float64
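
The upcast can be checked directly. The following is a minimal sketch, assuming a small hypothetical frame (the column names a and b are made up for illustration). It suggests that a row taken from a mixed int/float frame has a single float64 dtype, and that NumPy rejects a float as a slice bound:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one integer column, one float column.
demo = pd.DataFrame({"a": [1, 6], "b": [7.5, 12.0]})

# Each row handed to apply(axis=1) is a Series with one common dtype.
row_dtypes = demo.apply(lambda row: str(row.dtype), axis=1).tolist()

# With only the integer column present, rows keep an integer dtype.
int_row_dtypes = demo[["a"]].apply(lambda row: str(row.dtype), axis=1).tolist()

# Pull the "a" value out of a mixed row: it arrives as a float.
first_a = demo.iloc[0]["a"]

data = np.arange(10)
try:
    data[first_a:]  # float slice bound
    slice_failed = False
except TypeError:
    slice_failed = True

print(row_dtypes, int_row_dtypes, type(first_a).__name__, slice_failed)
```

Wrapping the bound in int(), as in the fix that follows, is one way to get back a value that slicing accepts.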

If all columns are integers, each row is passed as an integer Series:

def f(x):
    print(x)

df['es'] = df.apply(f, axis=1)

Output:

lb     4
rb    15
mv    25
ss     1
Name: 0, dtype: int64
lb     7
rb    18
mv    40
ss     6
Name: 1, dtype: int64

Therefore, converting x['ss'] to an integer before slicing works correctly:

df['cc']=0.3 *df['mv']
df['es'] = df.apply(lambda x: np.argmax(arr[0][int(x['ss']):-1] < x['cc'] ), axis=1)

print (df)
   lb  rb  mv  ss    cc  es
0   4  15  25   1   7.5   0
1   7  18  40   6  12.0   0
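
An alternative that sidesteps the row-wise upcast, offered only as a sketch and not part of the original answer, is to iterate over the columns with zip instead of calling apply with axis=1. Each value then keeps the dtype of its own column, so no int() cast is needed on the slice bounds. The names follow the question's code, except that arr is kept one-dimensional here as a simplifying assumption:

```python
import numpy as np
import pandas as pd

arr = np.array([10, 2, 5, 3, 6, 8, 3, 3, 2, 5, 6, 8, 11, 14, 11, 100, 1, 3, 20, 21])
df = pd.DataFrame(zip([4, 7], [15, 18], [25, 40]), columns=['lb', 'rb', 'mv'])
df['cc'] = 0.3 * df['mv']

# zip walks the columns in parallel: lb and rb stay integers, cc stays float.
df['ss'] = [int(np.argmax(arr[lb:rb] >= cc))
            for lb, rb, cc in zip(df['lb'], df['rb'], df['cc'])]

# ss is an integer column, so it can be used as a slice bound directly.
df['es'] = [int(np.argmax(arr[ss:-1] < cc))
            for ss, cc in zip(df['ss'], df['cc'])]

print(df)
```

This should give the same ss and es values as the output above, although the column order differs because cc is created before ss.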