I currently have a dataframe with strain and stress columns containing corresponding values. I want to slice the dataframe in a particular way: find the max value in stress, then take 5 rows starting at that row, in their original order. (I don't want to just find all the highest values in the column and sort by that.) Here is what I'm doing currently:
import pandas as pd
df = pd.DataFrame({"strain": [1,2,4,6,2,4,7,4,8,3,4,7,3,3,6,4,7,4,3,2],
"stress": [0,0.2,0.5,0.8,0.7,1,0.7,0.6,0.7,0.8,0.4,0.2,0,-0.5,-0.8,-1,-0.8,-0.9,-0.7,-0.6]})
#Sort by stress values
new_df = df.copy()
new_df = new_df.sort_values(by = ['stress'], ascending = False)
new_df = new_df[0:5]
And this is my current output:
print(new_df)
   strain  stress
5       4     1.0
3       6     0.8
9       3     0.8
4       2     0.7
6       7     0.7
So my code is sorting by the highest values in stress. However, I want to maintain the original row order after the highest value in the column. This would be my expected output:
print(new_df)
   strain  stress
5       4     1.0
6       7     0.7
7       4     0.6
8       8     0.7
9       3     0.8
CodePudding user response:
You can use argmax to find the integer position of the maximum, then slice with iloc from there:
imax = df.stress.argmax()
df.iloc[imax:imax + 5]
Result:
   strain  stress
5       4     1.0
6       7     0.7
7       4     0.6
8       8     0.7
9       3     0.8
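For reference, here is the same approach as a self-contained sketch. The helper name rows_from_max and the shifted dataframe are made up for illustration and are not part of the question. It assumes a reasonably recent pandas, where Series.argmax returns an integer position rather than an index label, which is why it pairs with iloc and keeps working when the index is not the default 0..n-1.

```python
import pandas as pd

df = pd.DataFrame({
    "strain": [1, 2, 4, 6, 2, 4, 7, 4, 8, 3, 4, 7, 3, 3, 6, 4, 7, 4, 3, 2],
    "stress": [0, 0.2, 0.5, 0.8, 0.7, 1, 0.7, 0.6, 0.7, 0.8,
               0.4, 0.2, 0, -0.5, -0.8, -1, -0.8, -0.9, -0.7, -0.6],
})


def rows_from_max(frame, column, n=5):
    """Hypothetical helper: n rows starting at the row holding the column's max."""
    # argmax gives the integer position of the first maximum, not its label
    start = frame[column].argmax()
    # iloc slices by position, so the original row order is preserved
    return frame.iloc[start:start + n]


result = rows_from_max(df, "stress")
print(result)

# Same data with a non-default index (labels 100..119): positions still work
shifted = df.set_index(df.index + 100)
result_shifted = rows_from_max(shifted, "stress")
print(result_shifted)
```

If the maximum sits within the last few rows, the slice simply returns fewer than n rows instead of raising an error.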