In the following dataset, I need to spread the "description" values for each "name" (for example, inventory1, inventory2 and inventory3) across two separate columns (namely description1 and description2). If I use either pivot_table or groupby, the description values become column headers instead of values under a column. How can I generate the desired output? Thanks
import pandas as pd

df1 = {'item': ['item1', 'item2', 'item3', 'item4', 'item5', 'item6'],
       'name': ['inventory1', 'inventory1', 'inventory2', 'inventory2', 'inventory3', 'inventory3'],
       'code': [1, 1, 2, 2, 3, 3],
       'description': ['sales number decrease compared to last month',
                       'Sales number decreased',
                       'sales number increased',
                       'Sales number increased, need to keep kpi',
                       'no sales this month',
                       'item out of stock']}
df1 = pd.DataFrame(df1)
CodePudding user response:
You can actually use pd.concat:
new_df = pd.concat([
    (
        df1.drop_duplicates('name')
        .drop('description', axis=1)
        .reset_index(drop=True)
    ),
    (
        pd.DataFrame([pd.Series(l) for l in df1.groupby('name')['description'].agg(list).tolist()])
        .add_prefix('description')
    ),
],
axis=1)
Output:
>>> new_df
    item        name  code                                  description0                              description1
0  item1  inventory1     1  sales number decrease compared to last month                    Sales number decreased
1  item3  inventory2     2                        sales number increased  Sales number increased, need to keep kpi
2  item5  inventory3     3                           no sales this month                         item out of stock
One-liner version of the above, in case you want it:
pd.concat([df1.drop_duplicates('name').drop('description', axis=1).reset_index(drop=True), pd.DataFrame([pd.Series(l) for l in df1.groupby('name')['description'].agg(list).tolist()]).add_prefix('description')], axis=1)
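If you want the columns named description1 and description2 as in the question, one alternative is to number the rows within each name with groupby().cumcount() and then pivot on that number. This is only a sketch, not part of the answer above: the names pos, col, wide, first and result are made up for illustration, and it assumes you want to keep the first item and code for each name.

```python
import pandas as pd

df1 = pd.DataFrame({
    'item': ['item1', 'item2', 'item3', 'item4', 'item5', 'item6'],
    'name': ['inventory1', 'inventory1', 'inventory2',
             'inventory2', 'inventory3', 'inventory3'],
    'code': [1, 1, 2, 2, 3, 3],
    'description': [
        'sales number decrease compared to last month',
        'Sales number decreased',
        'sales number increased',
        'Sales number increased, need to keep kpi',
        'no sales this month',
        'item out of stock',
    ],
})

# Position of each row within its name group, starting at 1
# (cumcount itself starts at 0).
pos = df1.groupby('name').cumcount() + 1

# Build the target column label per row, then pivot so that the
# descriptions stay as values and only the labels become headers.
wide = (
    df1.assign(col='description' + pos.astype(str))
       .pivot(index='name', columns='col', values='description')
       .reset_index()
)
wide.columns.name = None  # drop the leftover 'col' axis label

# Assumption: keep the first item and code seen for each name.
first = df1.drop_duplicates('name')[['item', 'name', 'code']]

# Join on the key instead of relying on row position.
result = first.merge(wide, on='name')
print(result)
```

Because the two pieces are joined on name rather than concatenated by position, this does not depend on groupby (which sorts its keys by default) and drop_duplicates (which keeps first-appearance order) happening to produce the names in the same order.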