For a pandas DataFrame, is there a way to display rows of the same category together as one group while retaining all the other values as strings?

Assume I have the following scenario:

import pandas as pd

df = pd.DataFrame({"category": ['Associates', 'Manager', 'Associates', 'Associates', 'Engineer', 'Engineer', 'Manager', 'Engineer'],
                   "name": ['Abby', 'Jenny', 'Thomas', 'John', 'Eve', 'Danny', 'Kenny', 'Helen'],
                   "email": ['[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]']})

How can I display the DataFrame in this way?

Output:

category     name     email
Associates   Abby     [email protected]
             Thomas   [email protected]
             John     [email protected]
Manager      Jenny    [email protected]
             Kenny    [email protected]
Engineer     Eve      [email protected]
             Danny    [email protected]
             Helen    [email protected]

Any advice? Can it be done with groupby functions? Thanks!

CodePudding user response：

This takes two lines of code. First, set both category and name as the index:

df.set_index(['category','name'],inplace=True)

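Next, group by both index levels and call sum to get your desired output. Every (category, name) pair is unique here, so sum returns each email string unchanged; with duplicate pairs it would concatenate the strings. Note that groupby sorts the groups, so the categories come out in alphabetical order rather than in the order shown in the question.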

df.groupby(level=[0,1]).sum()
Out[67]: 
                              email
category   name                    
Associates Abby      [email protected]
           John      [email protected]
           Thomas  [email protected]
Engineer   Danny    [email protected]
           Eve        [email protected]
           Helen    [email protected]
Manager    Jenny    [email protected]
           Kenny    [email protected]
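
The output above is sorted alphabetically by category, whereas the question lists the categories in order of first appearance (Associates, Manager, Engineer). If that order matters, one possible approach is to turn category into an ordered Categorical and do a stable sort before setting the index. This is only a sketch: the example.com addresses are placeholders made up for illustration, and the names first_seen and grouped are hypothetical.

```python
import pandas as pd

names = ["Abby", "Jenny", "Thomas", "John", "Eve", "Danny", "Kenny", "Helen"]
df = pd.DataFrame({
    "category": ["Associates", "Manager", "Associates", "Associates",
                 "Engineer", "Engineer", "Manager", "Engineer"],
    "name": names,
    # Placeholder addresses (assumption): not the ones from the original post.
    "email": [f"{n.lower()}@example.com" for n in names],
})

# Categories in the order they first appear in the data.
first_seen = df["category"].unique()

# An ordered Categorical sorts by that order instead of alphabetically.
df["category"] = pd.Categorical(df["category"], categories=first_seen, ordered=True)

# mergesort is stable, so names keep their original order within each category.
grouped = df.sort_values("category", kind="mergesort").set_index(["category", "name"])
print(grouped)
```

When printed, pandas shows each category label only once per run of consecutive rows, which is why the rows have to be sorted by category first.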

CodePudding user response：

You can use the groupby() function for this. Sample code is shown below.

df.groupby(['category','name']).max()

The result is now indexed by category and name, which gives the grouped display you described. If you want to turn the index back into ordinary columns, use the code below:

df.groupby(['category','name']).max().reset_index()
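
After reset_index(), category becomes a regular column again, so its value is repeated on every row. If the goal is a flat table that still shows each category only once, a rough sketch is to sort by category and blank out the repeats with duplicated() and mask(). The example.com addresses are placeholders made up for illustration, and the name flat is hypothetical.

```python
import pandas as pd

names = ["Abby", "Jenny", "Thomas", "John", "Eve", "Danny", "Kenny", "Helen"]
df = pd.DataFrame({
    "category": ["Associates", "Manager", "Associates", "Associates",
                 "Engineer", "Engineer", "Manager", "Engineer"],
    "name": names,
    # Placeholder addresses (assumption): not the ones from the original post.
    "email": [f"{n.lower()}@example.com" for n in names],
})

# Stable sort so rows of the same category sit next to each other.
flat = df.sort_values("category", kind="mergesort").reset_index(drop=True)

# duplicated() is True for every repeat after the first; mask() blanks those cells.
flat["category"] = flat["category"].mask(flat["category"].duplicated(), "")

# Print without the integer index so only the three columns show.
print(flat.to_string(index=False))
```

This changes the data itself (the blanked cells no longer hold the category), so it is best kept for display only.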