Suppose we have an indexed DataFrame with an arbitrary but large number of columns:
from numpy.random import randint
import pandas as pd
df = pd.DataFrame(randint(0,100,size=(10, 4)), columns=list('ABCD'))
print(df)
> A B C D
> 0 78 1 97 98
> 1 93 58 46 45
> 2 50 1 77 27
> 3 63 87 66 21
> 4 26 1 10 46
> 5 26 60 71 79
> 6 74 4 62 98
> 7 93 22 23 89
> 8 30 31 14 46
> 9 51 4 90 22
And we have a selector, which holds the index label to pick for each column, like:
selector = pd.DataFrame({ "other_index": randint(len(df.index),size=len(df.columns))},
index=df.columns)
print(selector)
> other_index
> A 9
> B 0
> C 3
> D 4
Now I would like to get the value at the selected index for each column:
selected = [df[c].loc[selector.loc[c, "other_index"]] for c in df.columns]
print(selected)
> [51, 1, 66, 46]
I'm pretty sure there is a more efficient way to achieve this in pandas, but I can't find it.
CodePudding user response:
IIUC, you could stack and slice:
idx = list(zip(selector['other_index'], selector.index))
df.stack().loc[idx].to_list()
output (from a run with a different random selector than the one printed in the question): [51, 31, 46, 46]
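For a self-contained check, here is a minimal sketch of the stack-and-slice idea with the question's printed values hard-coded instead of random ones. Wrapping the zip in list() is my own assumption, so that .loc receives a concrete list of (row, column) tuples.

```python
import pandas as pd

# Values copied from the question's printed DataFrame (hard-coded, not random).
df = pd.DataFrame(
    {
        "A": [78, 93, 50, 63, 26, 26, 74, 93, 30, 51],
        "B": [1, 58, 1, 87, 1, 60, 4, 22, 31, 4],
        "C": [97, 46, 77, 66, 10, 71, 62, 23, 14, 90],
        "D": [98, 45, 27, 21, 46, 79, 98, 89, 46, 22],
    }
)
# Selector copied from the question's printed output.
selector = pd.DataFrame({"other_index": [9, 0, 3, 4]}, index=list("ABCD"))

# stack() yields a Series keyed by (row label, column label),
# so each (row, column) tuple picks exactly one cell.
idx = list(zip(selector["other_index"], selector.index))
selected = df.stack().loc[idx].to_list()
print(selected)
```

This should give the same values as the list comprehension in the question, without a Python-level loop over the columns.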
CodePudding user response:
I would use df.lookup while it is still available (it was deprecated in pandas 1.2.0 and removed in pandas 2.0). :)
df = pd.DataFrame(randint(0,100,size=(10, 4)), columns=list('ABCD'))
A B C D
0 93 30 17 42
1 38 55 10 46
2 7 30 86 36
3 25 48 25 62
4 1 61 50 0
5 18 87 98 87
6 61 57 80 34
7 38 50 32 96
8 72 68 75 74
9 70 99 77 28
selector = pd.DataFrame({ "other_index": randint(len(df.index),size=len(df.columns))},
index=df.columns)
other_index
A 5
B 7
C 5
D 9
df.lookup(selector.other_index, selector.index)
array([18, 50, 98, 28])
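If df.lookup is not available in your pandas version, a rough equivalent is to translate the labels to positions and index the underlying NumPy array. This is only a sketch of that idea, not the original lookup implementation. The names rows and cols are mine, and the data is hard-coded from the output printed above.

```python
import pandas as pd

# Values copied from the DataFrame printed in this answer (hard-coded, not random).
df = pd.DataFrame(
    {
        "A": [93, 38, 7, 25, 1, 18, 61, 38, 72, 70],
        "B": [30, 55, 30, 48, 61, 87, 57, 50, 68, 99],
        "C": [17, 10, 86, 25, 50, 98, 80, 32, 75, 77],
        "D": [42, 46, 36, 62, 0, 87, 34, 96, 74, 28],
    }
)
selector = pd.DataFrame({"other_index": [5, 7, 5, 9]}, index=list("ABCD"))

# Convert row labels and column labels to integer positions.
rows = df.index.get_indexer(selector["other_index"])
cols = df.columns.get_indexer(selector.index)

# Pairwise (row, column) indexing on the underlying array.
selected = df.to_numpy()[rows, cols]
print(selected)
```

Note that get_indexer returns -1 for labels it cannot find, which NumPy would silently treat as "last row/column", so you may want to check for -1 if the selector can contain missing labels.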