How to convert rows into columns (as value but not header) in Python

Time:03-21

In the following dataset, I need to spread the "description" values for each entry in the "name" column (for example, inventory1, inventory2 and inventory3) across two separate columns (description1 and description2). If I use either pivot_table or groupby, the description values become column headers instead of values under a column. How can I generate the desired output? Thanks

import pandas as pd

df1 = {'item': ['item1', 'item2', 'item3', 'item4', 'item5', 'item6'],
       'name': ['inventory1', 'inventory1', 'inventory2', 'inventory2', 'inventory3', 'inventory3'],
       'code': [1, 1, 2, 2, 3, 3],
       'description': ['sales number decrease compared to last month',
                       'Sales number decreased',
                       'sales number increased',
                       'Sales number increased, need to keep kpi',
                       'no sales this month',
                       'item out of stock']}

df1 = pd.DataFrame(df1)

Desired output: one row per name, with its two descriptions in the columns description1 and description2 (image not available).

Answer:

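You can use pd.concat to join the de-duplicated rows with the descriptions spread into columns:

new_df = pd.concat([
        (
            # One row per name: keep the first item/code, drop the description
            df1.drop_duplicates('name')
            .drop('description', axis=1)
            .reset_index(drop=True)
        ),
        (
            # One row per name, one column per description;
            # sort=False keeps the groups in the same order as drop_duplicates
            pd.DataFrame([pd.Series(l) for l in df1.groupby('name', sort=False)['description'].agg(list).tolist()])
            .add_prefix('description')
        ),
    ],
    axis=1)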

Output:

>>> new_df
    item        name  code                                  description0                              description1
0  item1  inventory1     1  sales number decrease compared to last month                    Sales number decreased
1  item3  inventory2     2                        sales number increased  Sales number increased, need to keep kpi
2  item5  inventory3     3                           no sales this month                         item out of stock

One-liner version of the above, in case you want it:

pd.concat([df1.drop_duplicates('name').drop('description', axis=1).reset_index(drop=True), pd.DataFrame([pd.Series(l) for l in df1.groupby('name', sort=False)['description'].agg(list).tolist()]).add_prefix('description')], axis=1)
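
If you want the columns named description1 and description2 as in the question, here is an alternative sketch rather than the method above. It numbers each description within its name using groupby(...).cumcount() and pivots on that number, so the description text stays in the cells and does not become a header. The names pos, wide, first_rows and result are only illustrative, and it assumes that taking the first item and code for each name is what you want:

```python
import pandas as pd

df1 = pd.DataFrame({
    'item': ['item1', 'item2', 'item3', 'item4', 'item5', 'item6'],
    'name': ['inventory1', 'inventory1', 'inventory2', 'inventory2', 'inventory3', 'inventory3'],
    'code': [1, 1, 2, 2, 3, 3],
    'description': [
        'sales number decrease compared to last month',
        'Sales number decreased',
        'sales number increased',
        'Sales number increased, need to keep kpi',
        'no sales this month',
        'item out of stock',
    ],
})

# Position of each description within its name group, starting at 1
pos = df1.groupby('name').cumcount() + 1

# Pivot on the position (not on the description text), so the
# descriptions remain cell values under description1, description2, ...
wide = (
    df1.assign(pos=pos)
    .pivot(index='name', columns='pos', values='description')
    .add_prefix('description')
    .rename_axis(columns=None)
    .reset_index()
)

# Assumption: keep the first item and code seen for each name
first_rows = df1.drop_duplicates('name').drop(columns='description')
result = first_rows.merge(wide, on='name')
print(result)
```

Because the two parts are merged on name and not placed side by side by position, this version does not depend on the row order of either part.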