Home > Back-end >  Exploding a list of dictionaries in pandas to multiple rows and columns
Exploding a list of dictionaries in pandas to multiple rows and columns

Time:10-27

I'm working with Shopify's Order API and I receive the following responses, which currently sit in a dataframe (very simplified).

 ---------- ----------- ----------------------------------------------------------------------------- 
| order_id | name      | item_order                                                                  |
 ---------- ----------- ----------------------------------------------------------------------------- 
| 123      |  Jon Doe  | [{'id':789,'price':40.0,'sku':'HAT'},{'id':790,'price':41.0,'sku':'SHIRT'}] |
 ---------- ----------- ----------------------------------------------------------------------------- 
| 122      | Jonah Doe |                    [{'id':553,'price':20.0,'sku':'CAP'}]                    |
 ---------- ----------- ----------------------------------------------------------------------------- 
|          |           |                                                                             |
 ---------- ----------- ----------------------------------------------------------------------------- 

I'd like to explode item_order, so we get something closer to the below:

 ---------- ----------- --------------- ------------------ ---------------- 
| order_id | name      | item_order.id | item_order.price | item_order.sku |
 ---------- ----------- --------------- ------------------ ---------------- 
| 123      |  Jon Doe  | 789           | 40.0             | HAT            |
 ---------- ----------- --------------- ------------------ ---------------- 
| 123      | Jon Doe   | 790           | 41.0             | SHIRT          |
 ---------- ----------- --------------- ------------------ ---------------- 
| 122      | Jonah Doe | 553           | 20               | CAP            |
 ---------- ----------- --------------- ------------------ ---------------- 

CodePudding user response:

Use:

df = df.explode('item_order').reset_index(drop=True)
df = df.join(pd.json_normalize(df.pop('item_order')))

CodePudding user response:

First explode and then agg(pd.Series):

new_df = df.explode('item_order')
new_df[['item_order.id', 'item_order.price', 'item_order.sku']] = ( new_df['item_order'].
                                                              agg(pd.Series) )
  • Related