Groupby, transpose and top 5 python-CodePudding

I have a dataframe like this:

d = {'col1': ['Category1', 'Category1','Category1', 'Category1','Category1', 'Category2','Category2', 'Category2','Category2', 'Category2',], 
 'col2': ['Type1', 'Type2','Type3','Type4','Type5','Type1', 'Type2','Type3','Type4','Type5'], 
 'col3':[32,44,87,10,12,10,14,800,3200,35]}

df_test = pd.DataFrame(data=d)

df_test

col1        col2    col3
Category1   Type1   32
Category1   Type2   44
Category1   Type3   87
Category1   Type4   10
Category1   Type5   12
Category2   Type1   10
Category2   Type2   14
Category2   Type3   800
Category2   Type4   3200
Category2   Type5   35

What I'd like to get is the two rows Category 1 and Category 2 containing the sorted 3 largest Type based on col3 numbers. So the output would look like this:

col1       order1  order2  order3
Category1  Type3   Type2   Type1
Category2  Type4   Type3   Type5

I've tried:

df_test=df_test.groupby(['col1']).apply(lambda x: (x.groupby('col2')
                                      .sum()
                                      .sort_values('col3', ascending=False))
                                      .head(3)).T

And other different ways with no success. Any ideas? Thank you

CodePudding user response：

df_t = df.pivot(columns='col2', index='col1')
df_t = df_t.apply(lambda s: [x[1] for x in s.nlargest(3).index], axis=1)
print(df_t)

Output:

col1
Category1    [Type3, Type2, Type1]
Category2    [Type4, Type3, Type5]
dtype: object

CodePudding user response：

df_test['col3'] = df_test.groupby('col1')['col3'].rank(ascending = False)
df1 = df_test.query('col3 <= 3').reset_index(drop = True)
df1['col4'] = 'order'   df1['col3'].astype(int).astype(str)
df1.pivot('col1', 'col4', 'col2').reset_index()

col3       col1 order1 order2 order3
0     Category1  Type3  Type2  Type1
1     Category2  Type4  Type3  Type5