Groupby with multiple columns and then create plot for top N

Time:09-21

I have a dataset that I grouped by two columns. Now I want to plot only the top N groups of one of those columns. The example below shows the grouped data, which was produced from the main dataset using groupby.

Data1 Data2 Value
A x 6
A y 7
A z 8
B y 3
B z 4
B u 5
C x 6
C y 7
C v 8
D v 4
D y 5
D z 7
E t 8
E u 7
E x 6
F s 4
F s 5
F r 6

Now I want to keep only the top 3 Data1 groups in a new dataset and plot it with seaborn. The desired result is below.

Data1 Data2 Value
A x 6
A y 7
A z 8
B y 3
B z 4
B u 5
C x 6
C y 7
C v 8

CodePudding user response:

IIUC, you want to keep the first N groups of Data1?

You can use unique, which returns the values in order of first appearance, slice it to get the first N groups, then use boolean indexing with isin:

N = 3
out = df[df['Data1'].isin(df['Data1'].unique()[:N])]
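
For reference, here is a self-contained sketch of the same approach. It rebuilds the example frame from the question by hand (the real df would come from your own groupby), and the name first_groups is only illustrative:

```python
import pandas as pd

# Example data copied from the question (already the result of a groupby).
df = pd.DataFrame({
    "Data1": list("AAABBBCCCDDDEEEFFF"),
    "Data2": list("xyzyzuxyvvyztuxssr"),
    "Value": [6, 7, 8, 3, 4, 5, 6, 7, 8, 4, 5, 7, 8, 7, 6, 4, 5, 6],
})

N = 3
# unique() keeps the order of first appearance, so slicing gives the first N keys.
first_groups = df["Data1"].unique()[:N]
# Boolean indexing: keep only the rows whose Data1 is one of those keys.
out = df[df["Data1"].isin(first_groups)]
print(out)
```

Note that this keeps the first N groups as they appear in the frame; it does not rank the groups by Value.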

Another option uses itertools.islice and pandas.concat on the groupby (less efficient):

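from itertools import islice
import pandas as pd
# sort=False keeps the groups in order of first appearance instead of sorting the keys
out = pd.concat([g for _, g in islice(df.groupby('Data1', sort=False), N)])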

output:

  Data1 Data2  Value
0     A     x      6
1     A     y      7
2     A     z      8
3     B     y      3
4     B     z      4
5     B     u      5
6     C     x      6
7     C     y      7
8     C     v      8
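
If "top N" is instead meant as the N Data1 groups with the largest total Value, a rough sketch could look like the one below, followed by the seaborn plot the question asks for. Ranking by the sum of Value is an assumption on my part, as are the names totals, top_keys and top and the choice of a grouped bar plot. The Agg backend line is only there so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import pandas as pd
import seaborn as sns

# Example data copied from the question.
df = pd.DataFrame({
    "Data1": list("AAABBBCCCDDDEEEFFF"),
    "Data2": list("xyzyzuxyvvyztuxssr"),
    "Value": [6, 7, 8, 3, 4, 5, 6, 7, 8, 4, 5, 7, 8, 7, 6, 4, 5, 6],
})

N = 3
# Total Value per Data1 group, then the keys of the N largest totals.
totals = df.groupby("Data1")["Value"].sum()
top_keys = totals.nlargest(N).index
# Keep only the rows that belong to those groups.
top = df[df["Data1"].isin(top_keys)]

# Grouped bar plot: one cluster per Data1, one bar per Data2.
ax = sns.barplot(data=top, x="Data1", y="Value", hue="Data2")
ax.set_title(f"Top {N} Data1 groups by total Value")
```

With this ranking the selected groups differ from the first-N result above, since several groups tie on the highest total. To plot the first N groups instead, pass out to sns.barplot in place of top. Call ax.figure.savefig(...) or plt.show() to save or display the figure.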