In a pandas DataFrame, is there a way to display rows of the same category together as one group while keeping all the other values as strings?
Suppose I have the following DataFrame:
import pandas as pd

df = pd.DataFrame({"category": ['Associates', 'Manager', 'Associates', 'Associates', 'Engineer', 'Engineer', 'Manager', 'Engineer'],
                   "name": ['Abby', 'Jenny', 'Thomas', 'John', 'Eve', 'Danny', 'Kenny', 'Helen'],
                   "email": ['[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]']})
How can I display the DataFrame this way?
Desired output:
category    name    email
Associates  Abby    [email protected]
            Thomas  [email protected]
            John    [email protected]
Manager     Jenny   [email protected]
            Kenny   [email protected]
Engineer    Eve     [email protected]
            Danny   [email protected]
            Helen   [email protected]
Any advice? Can this be done with groupby functions? Thanks!
CodePudding user response:
This takes two lines of code.
First, set both category and name as the index:
df.set_index(['category','name'],inplace=True)
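Next, group by both index levels and call sum() to get your desired output. Each (category, name) pair occurs only once here, so the email values come through unchanged. Note that groupby sorts the groups alphabetically: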
df.groupby(level=[0,1]).sum()
Out[67]:
                   email
category   name
Associates Abby    [email protected]
           John    [email protected]
           Thomas  [email protected]
Engineer   Danny   [email protected]
           Eve     [email protected]
           Helen   [email protected]
Manager    Jenny   [email protected]
           Kenny   [email protected]
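Since every (category, name) pair is unique in this data, a possible alternative is to skip the aggregation and just sort the MultiIndex. The sketch below is self-contained and only illustrative: the addresses in the post are redacted, so the `@example.com` emails are made-up placeholders, and the variable name `indexed` is my own.

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["Associates", "Manager", "Associates", "Associates",
                 "Engineer", "Engineer", "Manager", "Engineer"],
    "name": ["Abby", "Jenny", "Thomas", "John", "Eve", "Danny", "Kenny", "Helen"],
})
# Assumption: placeholder addresses, because the real ones are redacted in the post.
df["email"] = df["name"].str.lower() + "@example.com"

# Move category and name into a MultiIndex, then sort it so that rows
# sharing a category sit next to each other. When printed, pandas shows
# each repeated category label only once.
indexed = df.set_index(["category", "name"]).sort_index()
print(indexed)
```

Like the groupby version, this orders categories and names alphabetically, but it returns a new DataFrame instead of modifying df in place.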
CodePudding user response:
You can use the groupby() function for this. Sample code:
df.groupby(['category','name']).max()
The result is indexed by category and name, so it displays in the grouped layout you described. If you want to turn the index back into ordinary columns, add reset_index():
df.groupby(['category','name']).max().reset_index()
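Both groupby approaches sort the groups alphabetically, whereas the output in the question keeps categories and names in the order they first appear (Associates, Manager, Engineer). One way this might be handled, as a rough sketch rather than a definitive recipe, is to make category an ordered categorical and use a stable sort. The `@example.com` addresses are placeholders because the real ones are redacted, and the names `order` and `ordered` are my own.

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["Associates", "Manager", "Associates", "Associates",
                 "Engineer", "Engineer", "Manager", "Engineer"],
    "name": ["Abby", "Jenny", "Thomas", "John", "Eve", "Danny", "Kenny", "Helen"],
})
# Assumption: placeholder addresses, because the real ones are redacted in the post.
df["email"] = df["name"].str.lower() + "@example.com"

# Categories in order of first appearance: Associates, Manager, Engineer.
order = df["category"].unique()

ordered = (
    df.assign(category=pd.Categorical(df["category"], categories=order, ordered=True))
      # A stable sort keeps the original row order inside each category.
      .sort_values("category", kind="stable")
      .set_index(["category", "name"])
)
print(ordered)
```

Printing `ordered` shows each category label once, followed by its names in their original order.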