I have a dataframe, let's use this example:
df = pd.DataFrame({'A': [5,6,3,4], 'B': [1,2,3,5]})
df
   A  B
0  5  1
1  6  2
2  3  3
3  4  5
and I have a list of column labels, one per row, that I want to select, let's say:
lst = ['A', 'B', 'B', 'A']
What I want to obtain is [5, 2, 3, 4], as a Series if possible. How can I do this?
I tried with masks, but I couldn't make it work.
CodePudding user response:
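You can do the lookup with NumPy integer indexing, pairing each row with the position of its column:
lst = ['A', 'B', 'B', 'A']  # column label to pick for each row
df.values[range(len(df)), df.columns.get_indexer_for(lst)]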
array([5, 2, 3, 4], dtype=int64)
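Since the question asks for a Series, here is a minimal sketch of the same idea with the result wrapped in one. The names `lst`, `col_pos` and `result` are just illustrative, and the check for missing labels is my own addition: `get_indexer_for` returns -1 for a label that is not a column, which would otherwise silently select the last column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
lst = ['A', 'B', 'B', 'A']  # column label to pick for each row

# Integer position of each requested label within df.columns
col_pos = df.columns.get_indexer_for(lst)

# -1 marks a label that is not a column; fail loudly instead of
# letting it index the last column
if (col_pos == -1).any():
    raise KeyError("some labels in lst are not columns of df")

# Pair row i with column col_pos[i], then keep the original index
values = df.to_numpy()[np.arange(len(df)), col_pos]
result = pd.Series(values, index=df.index)
```

This assumes `lst` has exactly one label per row of `df`; if the columns have mixed dtypes, `to_numpy()` will fall back to an object array.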
CodePudding user response:
You could create a lookup using NumPy advanced indexing:
lst = ['A', 'B', 'B', 'A']
# idx: position of each label within cols (the unique labels, in order of first appearance)
idx, cols = pd.factorize(lst)
out = df.reindex(cols, axis=1).to_numpy()[range(len(df)), idx].tolist()
Output:
[5, 2, 3, 4]
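To make the intermediate pieces visible, here is a rough step-by-step sketch of the same approach, returning a Series as the question asked. The variable names `sub` and `out` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
lst = ['A', 'B', 'B', 'A']

# factorize encodes each label as an integer code plus the unique labels:
# idx holds one code per row, cols holds the distinct labels
idx, cols = pd.factorize(lst)

# Reorder the columns to match cols, so code k refers to column k of sub
sub = df.reindex(cols, axis=1)

# Pick element (i, idx[i]) for every row i and keep the original index
out = pd.Series(sub.to_numpy()[np.arange(len(df)), idx], index=df.index)
```

One consequence of going through `reindex`: a label that is not a column of `df` becomes an all-NaN column in `sub`, so it shows up as NaN in the result rather than picking a wrong value.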