I currently have a data frame that has 3 columns and looks like this.
stations neighbourhoods Distance
9 68 0.429490
9 74 0.542039
9 66 0.604525
20 504 0.710707
20 83 0.785087
28 31 7.116540
28 30 7.839042
28 34 9.190559
28 26 10.320423
As you can see, stations column consists of 3 groups = 9,20,28. They all have corresponding distances according to a neighbourhood. The distances are already sorted according to their respective stations. What I want to achieve is to extract each minimum and maximum value according to their station, and put it into another dataframe.
Desired dataframe must look like this.
stations Minimum Distance Maximum distance
9 0.429490 0.604525
20 0.710707 0.785087
28 7.116540 10.320423
So far I have tried this code df_max = df.groupby('Distance').idxmax()
for max amounts but it did not work. How can I go on? Thanks in advance!
CodePudding user response:
Use groupby
agg
with min
/max
:
out = df.groupby('stations')['Distance'].agg(['min', 'max']).reset_index()
output:
stations min max
0 9 0.429490 0.604525
1 20 0.710707 0.785087
2 28 7.116540 10.320423
Or using named aggregation:
(df.groupby('stations')
.agg(**{'Minimum Distance': ('Distance', 'min'),
'Maximum distance': ('Distance', 'max')})
.reset_index()
)
output:
stations Minimum Distance Maximum distance
0 9 0.429490 0.604525
1 20 0.710707 0.785087
2 28 7.116540 10.320423