I have one dataframe holding the data and another dataframe with a single row of row indices.
import pandas as pd

data = {'col_1': [4, 5, 6, 7], 'col_2': [3, 4, 9, 8], 'col_3': [5, 5, 6, 9], 'col_4': [8, 7, 6, 5]}
df = pd.DataFrame(data)
ind = {'ind_1': [2], 'ind_2': [1], 'ind_3': [3], 'ind_4': [2]}
ind = pd.DataFrame(ind)
Both have the same number of columns. I want to extract the values of df corresponding to the index stored in ind so that I get a single row at the end. For this data it should be: [6, 4, 9, 6]. I tried df.loc[ind.loc[0]], but that of course gives me four different rows, not one. The other idea I have is to zip columns and rows and iterate over them. But I feel there should be a simpler way.
CodePudding user response:
You can move to the NumPy domain and index there:
In [14]: df.to_numpy()[ind, np.arange(len(df.columns))]
Out[14]: array([[6, 4, 9, 6]], dtype=int64)
This pairs up 2, 1, 3, 2 from ind and 0, 1, 2, 3 (from 0 to number of columns - 1), so we get the values at [2, 0], [1, 1] and so on.
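As a self-contained sketch of the same pairing, the row positions can be pulled out as a 1-D array first, which gives a flat result instead of a nested one. The names rows, cols and result are only illustrative, and the sketch assumes the values in ind are positional row numbers:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col_1': [4, 5, 6, 7], 'col_2': [3, 4, 9, 8],
                   'col_3': [5, 5, 6, 9], 'col_4': [8, 7, 6, 5]})
ind = pd.DataFrame({'ind_1': [2], 'ind_2': [1], 'ind_3': [3], 'ind_4': [2]})

# Row positions taken from the single row of ind: 2, 1, 3, 2
rows = ind.iloc[0].to_numpy()
# One column position per column of df: 0, 1, 2, 3
cols = np.arange(df.shape[1])

# Integer array indexing pairs rows[i] with cols[i],
# i.e. it picks the cells (2, 0), (1, 1), (3, 2), (2, 3)
result = df.to_numpy()[rows, cols]
print(result.tolist())  # [6, 4, 9, 6]
```

If a one-row dataframe is wanted rather than an array, pd.DataFrame([result], columns=df.columns) should wrap it back up.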
There's also df.lookup, but it was deprecated in pandas 1.2 and removed in pandas 2.0, so it only works on older versions:
In [19]: df.lookup(ind.iloc[0], df.columns)
C:\Users\need-\Anaconda3\Scripts\ipython:1: FutureWarning: The 'lookup' method is deprecated and will be removed in a future version. You can use DataFrame.melt and DataFrame.loc as a substitute.
Out[19]: array([6, 4, 9, 6], dtype=int64)
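Where lookup is not available, one possible stand-in is a plain label-based loop with df.at, which is close to the zip idea from the question. This is only a sketch, not the replacement the warning suggests; values is an illustrative name, and it assumes the entries of ind are valid row labels of df (true here because df has the default 0..3 index):

```python
import pandas as pd

df = pd.DataFrame({'col_1': [4, 5, 6, 7], 'col_2': [3, 4, 9, 8],
                   'col_3': [5, 5, 6, 9], 'col_4': [8, 7, 6, 5]})
ind = pd.DataFrame({'ind_1': [2], 'ind_2': [1], 'ind_3': [3], 'ind_4': [2]})

# Pair each row label from ind with the column label at the same position,
# then fetch that single cell with the label-based scalar accessor df.at
values = [int(df.at[row, col]) for row, col in zip(ind.iloc[0], df.columns)]
print(values)  # [6, 4, 9, 6]
```

For a handful of columns the loop is fine; for wide frames the NumPy indexing above is likely to be faster.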