The objective is to use apply to slice a range of an array and then apply NumPy's argmax to that slice.
However, the interpreter raises a TypeError:
TypeError: slice indices must be integers or None or have an __index__ method
for the following code:
import pandas as pd
import numpy as np
arr=np.array([10,2,5,3,6,8,3,3,2,5,6,8,11,14,11,100,1,3,20,21])
arr=arr.reshape((1,-1))
df=pd.DataFrame(zip([4,7],[15,18],[25,40]),columns=['lb','rb','mv'])
df['ss'] = df.apply(lambda x: np.argmax(arr[0][x['lb']:x['rb']] >= 0.3 * x['mv'] ), axis=1)
df['cc']=0.3 *df['mv']
df['es'] = df.apply(lambda x: np.argmax(arr[0][x['ss']:-1] < x['cc'] ), axis=1)
However, if I modify the two lines
df['cc']=0.3 *df['mv']
df['es'] = df.apply(lambda x: np.argmax(arr[0][x['ss']:-1] < x['cc'] ), axis=1)
into
df['es'] = df.apply(lambda x: np.argmax(arr[0][x['ss']:-1] < 0.3 *x['mv'] ), axis=1)
The program works like a charm.
I am curious why this issue arises.
CodePudding user response:
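The problem is that multiplying by a float creates a float column (cc). When apply is used with axis=1, each row is passed to the function as a single Series, so once the DataFrame mixes integer and float columns, every value in the row is upcast to float64. As a result x['ss'] is a float and cannot be used as a slice index: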
df['cc']=0.3 *df['mv']
def f(x):
    print(x)
df['es'] = df.apply(f, axis=1)
lb     4.0
rb    15.0
mv    25.0
ss     1.0
cc     7.5
Name: 0, dtype: float64
lb     7.0
rb    18.0
mv    40.0
ss     6.0
cc    12.0
Name: 1, dtype: float64
If all columns are integers (that is, without the cc column), each row is passed as an integer Series:
def f(x):
    print(x)
df['es'] = df.apply(f, axis=1)
lb     4
rb    15
mv    25
ss     1
Name: 0, dtype: int64
lb     7
rb    18
mv    40
ss     6
Name: 1, dtype: int64
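The upcasting described above can be reproduced in isolation. The following is only a minimal sketch: the small DataFrame and the names int_kinds, float_kinds and slice_error are made up for illustration and are not part of the original code. It records the dtype kind of each row seen by apply before and after a float column is added, then shows that a float taken from such a row is rejected as a slice index.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, loosely modelled on the question's columns
df = pd.DataFrame({'lb': [4, 7], 'ss': [1, 6]})

# With integer columns only, each row Series has an integer dtype ('i')
int_kinds = df.apply(lambda x: x.dtype.kind, axis=1).tolist()

# Adding one float column makes every row Series float ('f')
df['cc'] = [7.5, 12.0]
float_kinds = df.apply(lambda x: x.dtype.kind, axis=1).tolist()

# A float taken from such a row cannot be used as a slice bound
arr = np.arange(10)
try:
    arr[df.iloc[0]['ss']:-1]
    slice_error = None
except TypeError as exc:
    slice_error = str(exc)

print(int_kinds, float_kinds)
print(slice_error)
```

Note that the columns of df themselves keep their dtypes; only the per-row Series built for axis=1 is upcast.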
Therefore, converting x['ss'] to an integer before slicing works correctly:
df['cc']=0.3 *df['mv']
df['es'] = df.apply(lambda x: np.argmax(arr[0][int(x['ss']):-1] < x['cc'] ), axis=1)
print(df)
   lb  rb  mv  ss    cc  es
0   4  15  25   1   7.5   0
1   7  18  40   6  12.0   0
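As a possible alternative to casting inside the lambda, the columns can be iterated directly with zip, so that each value keeps the dtype of its own column and no row Series is ever built. This is only a sketch, not the approach from the answer above; it also assumes a 1-D arr (the question reshapes it to 2-D and indexes arr[0]) to keep the example short.

```python
import numpy as np
import pandas as pd

# Same data as in the question, kept 1-D here for brevity (an assumption)
arr = np.array([10, 2, 5, 3, 6, 8, 3, 3, 2, 5, 6, 8, 11, 14, 11, 100, 1, 3, 20, 21])
df = pd.DataFrame({'lb': [4, 7], 'rb': [15, 18], 'mv': [25, 40]})
df['cc'] = 0.3 * df['mv']

# Iterating column by column keeps lb, rb and ss as integers,
# so they remain valid slice bounds even though cc is a float column
df['ss'] = [int(np.argmax(arr[lb:rb] >= cc))
            for lb, rb, cc in zip(df['lb'], df['rb'], df['cc'])]
df['es'] = [int(np.argmax(arr[ss:-1] < cc))
            for ss, cc in zip(df['ss'], df['cc'])]

print(df)
```

With the data from the question this should give the same ss and es values as the apply version with int(x['ss']).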