Home > Enterprise >  Flattening dataframe Json column to new rows in Python
Flattening dataframe Json column to new rows in Python

Time:10-22

I am new to Python. I have dataframe obtained from SQL query result

UserId UserName Reason_details
851 Bob [ {"reasonId":264, "reasonDescription":"prohibited", "reasonCodes":[1 , 2]} , {"reasonId":267, "reasonDescription":"Expired", "reasonCodes":[25]} ]
852 Jack [{"reasonId":273, "reasonDescription":"Restricted", "reasonCodes":[29]}]

I want to modify this dataframe by flattening Reason_details column. Each reason in new row.

UserId UserName Reason_id Reson_description Reason_codes
851 Bob 264 Prohibited 1
851 Bob 264 Prohibited 2
851 Bob 267 Expired 25
852 Jack 273 Restricted 29

I flattened this data using good old for loops iterating over each row of source dataframe, reading value of each key in Reason_details column by using json_loads. And then creating final dataframe.

But I feel there has to be better way of doing this using dataframe and JSON functions in python.

PS: In my actual dataset there are 63 columns and 8 million rows out of which only Reason_details column has JSON value. Thus my existing approach is very inefficient iteration over all rows, all columns converting them in 2D list first and making final dataframe from it.

CodePudding user response:

can you try this:

df=df.explode('Reason_details')
df = df.join(df['Reason_details'].apply(pd.Series)).drop('Reason_details',axis=1).explode('reasonCodes').drop_duplicates()

CodePudding user response:

here is a slight different manner

df[['UserId', 'UserName']].merge(df['Reason_details']
                                 .explode()      # convert list to rows
                                 .apply(pd.Series) # creates dict keys as column
                                 .explode('reasonCodes'), # convert reason code into rows
                                 left_index=True, # merge with original DF
                                 right_index=True)
    UserId  UserName    reasonId    reasonDescription   reasonCodes
0      851  Bob              264    prohibited           1
0      851  Bob              264    prohibited           2
0      851  Bob              267    Expired             25
1      852  Jack             273    Restricted          29

  • Related