I'm working with Shopify's Order API and I receive the following responses, which currently sit in a dataframe (very simplified).
---------- ----------- -----------------------------------------------------------------------------
| order_id | name | item_order |
---------- ----------- -----------------------------------------------------------------------------
| 123 | Jon Doe | [{'id':789,'price':40.0,'sku':'HAT'},{'id':790,'price':41.0,'sku':'SHIRT'}] |
---------- ----------- -----------------------------------------------------------------------------
| 122 | Jonah Doe | [{'id':553,'price':20.0,'sku':'CAP'}] |
---------- ----------- -----------------------------------------------------------------------------
| | | |
---------- ----------- -----------------------------------------------------------------------------
I'd like to explode item_order, so we get something closer to the below:
---------- ----------- --------------- ------------------ ----------------
| order_id | name | item_order.id | item_order.price | item_order.sku |
---------- ----------- --------------- ------------------ ----------------
| 123 | Jon Doe | 789 | 40.0 | HAT |
---------- ----------- --------------- ------------------ ----------------
| 123 | Jon Doe | 790 | 41.0 | SHIRT |
---------- ----------- --------------- ------------------ ----------------
| 122 | Jonah Doe | 553 | 20 | CAP |
---------- ----------- --------------- ------------------ ----------------
CodePudding user response:
Use:
df = df.explode('item_order').reset_index(drop=True)
df = df.join(pd.json_normalize(df.pop('item_order')))
CodePudding user response:
First explode
and then agg(pd.Series)
:
new_df = df.explode('item_order')
new_df[['item_order.id', 'item_order.price', 'item_order.sku']] = ( new_df['item_order'].
agg(pd.Series) )