I am very new to Python and don't know how to proceed. I have an array and a DataFrame:
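import numpy as np
import pandas as pd

# Column labels (1-based) to pick from each row, in row order
Column = np.array([3, 1, 3, 2, 4])
df = pd.DataFrame({
    1: [1, 2, 3, 4, 5],
    2: ['A', 'B', 'C', 'D', 'E'],
    3: [6, 7, 8, 9, 0],
    4: ['F', 'G', 'H', 'I', 'J'],
})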
   1  2  3  4
0  1  A  6  F
1  2  B  7  G
2  3  C  8  H
3  4  D  9  I
4  5  E  0  J
I would like to go through the DataFrame row by row and use the array value at the same position to decide which column to take the value from, so that the result is [6, 2, 8, 'D', 'J'].
6 2 8 D J
CodePudding user response:
Use numpy indexing:
out = df.to_numpy()[np.arange(len(df)), Column-1]
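NB. Python indexing starts at 0, so we need to subtract 1 from Column to turn the column labels 1 to 4 into positions 0 to 3. np.arange(len(df)) supplies the matching row position for each entry.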
output: array([6, 2, 8, 'D', 'J'], dtype=object)
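Subtracting 1 only works because the column labels here happen to be 1, 2, 3, 4 in order. A minimal, self-contained sketch of the same NumPy indexing that looks the positions up from the labels instead, assuming every value in Column is an existing column label (the names col_pos and row_pos are made up for this illustration):

```python
import numpy as np
import pandas as pd

Column = np.array([3, 1, 3, 2, 4])
df = pd.DataFrame({
    1: [1, 2, 3, 4, 5],
    2: ['A', 'B', 'C', 'D', 'E'],
    3: [6, 7, 8, 9, 0],
    4: ['F', 'G', 'H', 'I', 'J'],
})

# Translate column labels into 0-based positions.
# get_indexer returns -1 for any label that is not in df.columns.
col_pos = df.columns.get_indexer(Column)
if (col_pos < 0).any():
    raise ValueError("Column contains a label that is not in df.columns")

# One row position per entry: 0, 1, 2, ...
row_pos = np.arange(len(df))

# Pairing the two integer arrays picks one element per (row, column) pair.
out = df.to_numpy()[row_pos, col_pos]
print(out)
```

With the data from the question, col_pos is the same as Column-1, so the result is unchanged.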
CodePudding user response:
If you want to perform the same operation with pure pandas functions and label-based indexing:
import pandas as pd
index = pd.MultiIndex.from_arrays([df.index, Column])
out = df.stack().loc[index]
print(out)
0 3 6
1 1 2
2 3 8
3 2 D
4 4 J
dtype: object
print(out.to_numpy())
[6 2 8 'D' 'J']
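Since the question asks about iterating over the rows, here is a plain-loop sketch for comparison. It is also label-based and uses df.at for single-value lookups; the name result is made up for this illustration. It should give the same values, though for large frames the vectorized approaches above are usually faster.

```python
import numpy as np
import pandas as pd

Column = np.array([3, 1, 3, 2, 4])
df = pd.DataFrame({
    1: [1, 2, 3, 4, 5],
    2: ['A', 'B', 'C', 'D', 'E'],
    3: [6, 7, 8, 9, 0],
    4: ['F', 'G', 'H', 'I', 'J'],
})

# Walk the row labels and the requested column labels together,
# fetching one scalar per (row label, column label) pair.
result = [df.at[row, col] for row, col in zip(df.index, Column)]
print(result)
```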