import pandas as pd
data = pd.DataFrame([[1, 2, 3], [21, 23, 24], [31, 32, 33]])
x = [0, 1, 2]  # row positions; this is the same as the index
y = [1, 2, 0]  # column position to pick in each row
data.iloc[x,y] gives me a 3x3 df, which I do not need.
I need to run this on a large df and would like to get only the elements at (row, column) positions (0,1), (1,2), (2,0) of the dataframe: 2, 24, 31. So I'd like to have the most efficient solution. I can obviously use a for loop with iterrows or even something like: data.apply(lambda row: row.iloc[y[int(row.name)]], axis=1).values
Is the for/apply already the fastest solution? Isn't there a more direct way of getting only the elements, not a slice, of a df when you have a list(/series/df) with index,column coordinates?
Thanks
CodePudding user response:
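Based on the documentation, it seems that data.to_numpy()[x, y] is a reasonable approach. NumPy pairs the two integer index lists elementwise, so this returns the values at (x[0], y[0]), (x[1], y[1]), (x[2], y[2]) instead of a 3x3 block.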
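For reference, here is a minimal sketch of that approach using the data from the question. The name picked is just illustrative, and the sketch assumes x and y hold integer positions rather than labels.

```python
import pandas as pd

data = pd.DataFrame([[1, 2, 3], [21, 23, 24], [31, 32, 33]])
x = [0, 1, 2]  # row positions
y = [1, 2, 0]  # column positions, one per row

# NumPy integer-array indexing pairs x[k] with y[k],
# so the result has one value per (row, column) pair.
picked = data.to_numpy()[x, y]
print(picked.tolist())  # [2, 24, 31]
```

If the index or column labels are not simply 0..n-1, the labels would probably need converting to positions first, for example with data.index.get_indexer(...) and data.columns.get_indexer(...). Also note that on a df with mixed column dtypes, to_numpy() returns an object array, which may cost some of the speed.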