Finding values from different rows in pandas

Time:09-06

I have a dataframe comprising the data and another dataframe, containing a single row carrying indices.

import numpy as np
import pandas as pd

data = {'col_1': [4, 5, 6, 7], 'col_2': [3, 4, 9, 8], 'col_3': [5, 5, 6, 9], 'col_4': [8, 7, 6, 5]}
df = pd.DataFrame(data)

ind = {'ind_1': [2], 'ind_2': [1],'ind_3': [3],'ind_4': [2]}
ind = pd.DataFrame(ind)

Both have the same number of columns. For each column of df, I want to extract the value at the row index stored in the matching column of ind, so that I end up with a single row.

For this data it should be: [6, 4, 9, 6]. I tried df.loc[ind.loc[0]] but that of course gives me four different rows, not one.

The other idea I have is to zip columns and rows and iterate over them. But I feel there should be a simpler way.

CodePudding user response:

You can drop down to NumPy and use integer array indexing there:

In [14]: df.to_numpy()[ind, np.arange(len(df.columns))]
Out[14]: array([[6, 4, 9, 6]], dtype=int64)

This pairs the row positions 2, 1, 3, 2 from ind with the column positions 0, 1, 2, 3 (0 through number of columns - 1), so we get the values at [2, 0], [1, 1], [3, 2] and [2, 3].
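The pairing can be spelled out step by step. This is only a sketch of the same idea; the names rows, cols, values and result are made up for illustration, and it assumes ind holds positional (0-based) row numbers:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col_1': [4, 5, 6, 7], 'col_2': [3, 4, 9, 8],
                   'col_3': [5, 5, 6, 9], 'col_4': [8, 7, 6, 5]})
ind = pd.DataFrame({'ind_1': [2], 'ind_2': [1], 'ind_3': [3], 'ind_4': [2]})

# Row position to pick in each column, flattened to 1-D: 2, 1, 3, 2
rows = ind.to_numpy().ravel()
# Column positions 0 .. number of columns - 1
cols = np.arange(df.shape[1])

# Integer array indexing pairs rows[i] with cols[i] element-wise
values = df.to_numpy()[rows, cols]

# Optional: wrap the values back into a one-row DataFrame with df's column names
result = pd.DataFrame([values], columns=df.columns)
print(values.tolist())  # [6, 4, 9, 6]
```

Flattening ind first gives a 1-D result instead of the nested [[6, 4, 9, 6]] shown above.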


There's also df.lookup, but it has been deprecated since pandas 1.2 and was removed in pandas 2.0, so it only works on older versions:

In [19]: df.lookup(ind.iloc[0], df.columns)
C:\Users\need-\Anaconda3\Scripts\ipython:1: FutureWarning: The 'lookup' method is deprecated and will be removed in a future version. You can use DataFrame.melt and DataFrame.loc as a substitute.

Out[19]: array([6, 4, 9, 6], dtype=int64)
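On pandas versions without df.lookup, a label-based equivalent can be approximated by translating labels to positions first. The helper name lookup_labels below is hypothetical, not a pandas API, and the sketch assumes the frame's index and columns contain no duplicate labels:

```python
import pandas as pd

def lookup_labels(frame, row_labels, col_labels):
    """Hypothetical helper: return frame.loc[r, c] for each (r, c) pair."""
    # Translate labels to integer positions; -1 marks a label that was not found
    r = frame.index.get_indexer(list(row_labels))
    c = frame.columns.get_indexer(list(col_labels))
    if (r < 0).any() or (c < 0).any():
        raise KeyError("one or more labels not found")
    # Pair positions element-wise with NumPy integer array indexing
    return frame.to_numpy()[r, c]

df = pd.DataFrame({'col_1': [4, 5, 6, 7], 'col_2': [3, 4, 9, 8],
                   'col_3': [5, 5, 6, 9], 'col_4': [8, 7, 6, 5]})
ind = pd.DataFrame({'ind_1': [2], 'ind_2': [1], 'ind_3': [3], 'ind_4': [2]})

out = lookup_labels(df, ind.iloc[0], df.columns)
print(out.tolist())  # [6, 4, 9, 6]
```

With the default RangeIndex, labels and positions coincide, so this gives the same values as the plain NumPy approach.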