I have a dataset that I grouped by two columns. Now I want to plot a graph for the top N groups based on one column. To explain it better, below is an example data set, which was created from the main data set using groupby.
Data1 | Data2 | Value |
---|---|---|
A | x | 6 |
A | y | 7 |
A | z | 8 |
B | y | 3 |
B | z | 4 |
B | u | 5 |
C | x | 6 |
C | y | 7 |
C | v | 8 |
D | v | 4 |
D | y | 5 |
D | z | 7 |
E | t | 8 |
E | u | 7 |
E | x | 6 |
F | s | 4 |
F | s | 5 |
F | r | 6 |
Now I want to keep only the top 3 Data1 groups to create a new data set and plot it with seaborn. Below is the desired result.
Data1 | Data2 | Value |
---|---|---|
A | x | 6 |
A | y | 7 |
A | z | 8 |
B | y | 3 |
B | z | 4 |
B | u | 5 |
C | x | 6 |
C | y | 7 |
C | v | 8 |
CodePudding user response:
IIUC, you want to keep the first N groups of Data1?
You can use unique and slice it to get the first N groups in order of appearance, then use boolean indexing:
N = 3
out = df[df['Data1'].isin(df['Data1'].unique()[:N])]
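Another option, using itertools.islice and pandas.concat on the groupby (less efficient; note that groupby sorts the group keys by default, so this returns the first N groups in sorted order rather than in order of appearance):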
from itertools import islice
import pandas as pd
out = pd.concat([g for _, g in islice(df.groupby('Data1'), N)])
output:
Data1 Data2 Value
0 A x 6
1 A y 7
2 A z 8
3 B y 3
4 B z 4
5 B u 5
6 C x 6
7 C y 7
8 C v 8
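If "top N" is instead meant as the N groups with the largest values, rather than the first N groups, a variation along the following lines might work. This is only a sketch under that assumption: ranking by the summed Value per Data1 is my guess at the criterion, and the names totals, top_keys and out_top are illustrative, not from the question.

```python
import pandas as pd

# Rebuild the example data set from the question.
df = pd.DataFrame({
    "Data1": list("AAABBBCCCDDDEEEFFF"),
    "Data2": list("xyzyzuxyvvyztuxssr"),
    "Value": [6, 7, 8, 3, 4, 5, 6, 7, 8, 4, 5, 7, 8, 7, 6, 4, 5, 6],
})

N = 3

# Assumption: "top" means the largest total Value per Data1 group.
totals = df.groupby("Data1")["Value"].sum()

# Keys of the N groups with the largest totals.
top_keys = totals.nlargest(N).index

# Keep only the rows belonging to those groups (original row order is preserved).
out_top = df[df["Data1"].isin(top_keys)]
print(out_top)
```

With the example data this keeps groups A, C and E (each totals 21), which differs from the desired result shown in the question, so it only applies if ranking by value is what is wanted.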