groupby iterate with sorting

Time:06-23

Below is the code and console output.

import pandas as pd

#data
df= pd.DataFrame([{'col1':'a', 'is_open':0}, {'col1':'b', 'is_open':1}])

#1
df = df.sort_values('is_open',ascending=False).reset_index(drop=True)
# print(df)

#2
for i, d in df.groupby(['col1', 'is_open']):
    print(d)

  col1  is_open
1    a        0
  col1  is_open
0    b        1

I want is_open=1 to be printed out first like below.

  col1  is_open
0    b        1
  col1  is_open
1    a        0

Sorting the DataFrame before grouping did not work.

Any help would be appreciated.

CodePudding user response:

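As Poiuy noted, groupby sorts by the grouping keys by default, but this can be turned off with sort=False, in which case groups come out in the order their keys first appear in the DataFrame.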

df.groupby(['col1', 'is_open'], sort=False)
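
To make this concrete, here is a minimal sketch using the data from the question. It assumes the frame is sorted descending by is_open before grouping; the variable name keys is only for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame([{'col1': 'a', 'is_open': 0}, {'col1': 'b', 'is_open': 1}])

# Put is_open=1 rows first; sort=False below preserves this row order
df = df.sort_values('is_open', ascending=False).reset_index(drop=True)

# With sort=False, groups are yielded in order of first appearance
keys = []
for key, group in df.groupby(['col1', 'is_open'], sort=False):
    keys.append(key)
    print(group)
```

With this ordering the ('b', 1) group should be printed before the ('a', 0) group.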

CodePudding user response:

You may want something like the snippet below. The pandas groupby method sorts by the group keys by default, which in this case means the first column, followed by the second. Once the values have been grouped, you can sort the result and then iterate through the DataFrame.

df = df.groupby(['col1', 'is_open']).apply(lambda x: x).sort_values('is_open',ascending=False)

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html
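
If the goal is only to visit the groups in a particular order, another option is to sort the group keys themselves and fetch each group by key. This is a sketch of that idea, not the approach from the answer above; ordered_keys and the sort key (the second element of each key tuple, i.e. is_open) are assumptions for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame([{'col1': 'a', 'is_open': 0}, {'col1': 'b', 'is_open': 1}])

grouped = df.groupby(['col1', 'is_open'])

# grouped.groups maps each (col1, is_open) key tuple to its row labels;
# order the keys by is_open, largest first
ordered_keys = sorted(grouped.groups, key=lambda k: k[1], reverse=True)

for key in ordered_keys:
    print(grouped.get_group(key))
```

This leaves the original DataFrame and its index untouched, at the cost of one get_group lookup per key.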

CodePudding user response:

Below worked for me. Thank you all.

import pandas as pd

#data
df= pd.DataFrame([{'col1':'a', 'is_open':0}, {'col1':'b', 'is_open':1}])

#1
# Sort descending so is_open=1 comes first; sort=False below keeps this order
df = df.sort_values('is_open',ascending=False).reset_index(drop=True)
# print(df)

#2
for i, d in df.groupby(['col1', 'is_open'], sort=False):
    print(d)