Pandas: expanding a JSON dictionary column into new separate columns

I have a dataset in which one column contains a JSON dictionary. I want to parse this column into new columns based on its keys. The column has dtype object and df.iloc gives me a string, so I couldn't figure out how to handle it. I tried json_normalize and tolist, but neither worked.

     Unnamed: 0                       _id                                         userInputs                              sessionID
222         222  5bc915caf9af8b0dad3c0660  [{'userID': 22, 'milesRequested': 170, 'WhPerM...  2_39_88_24_2018-04-30 15:07:48.608581

Here is a single userInputs value:

c.iloc[0]['userInputs']
"[{'userID': 22, 'milesRequested': 170, 'WhPerMile': 350, 'minutesAvailable': 550, 'modifiedAt': 'Mon, 30 Apr 2018 15:08:54 GMT', 'paymentRequired': True, 'requestedDeparture': 'Tue, 01 May 2018 00:17:49 GMT', 'kWhRequested': 59.5}]"

So userID, milesRequested, etc. should each become a new column holding the corresponding values, for every row in the dataset.


CodePudding user response：

First, convert each string to a Python object by applying ast.literal_eval to the column. Then explode the list so that each dict gets its own row, and expand the dicts into DataFrame columns:

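from ast import literal_eval

import pandas as pd

# Parse each string into a Python list of dicts
df["userInputs"] = df["userInputs"].apply(literal_eval)

# One row per dict, then expand each dict's keys into columns
df = df.explode("userInputs")
df = pd.concat([df, df.pop("userInputs").apply(pd.Series)], axis=1)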

print(df)

Prints:

   _id  userID  milesRequested  WhPerMile  minutesAvailable                     modifiedAt  paymentRequired             requestedDeparture  kWhRequested
0  xxx      22             170        350               550  Mon, 30 Apr 2018 15:08:54 GMT             True  Tue, 01 May 2018 00:17:49 GMT          59.5
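
A side note on why literal_eval is used rather than json.loads: the string in the question uses single quotes and Python's True, so it looks like the repr of a Python list, not valid JSON. The sketch below illustrates this on a shortened copy of the value from the question (the shortening is mine, and the variable names are only for illustration):

```python
import json
from ast import literal_eval

# Shortened copy of the userInputs string from the question
raw = "[{'userID': 22, 'milesRequested': 170, 'paymentRequired': True, 'kWhRequested': 59.5}]"

# json.loads rejects it: JSON needs double-quoted keys and lowercase true
try:
    json.loads(raw)
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

# literal_eval safely evaluates Python literals (lists, dicts, numbers, True/False)
parsed = literal_eval(raw)

print(json_ok)
print(type(parsed), parsed[0]["userID"])
```

If the column held real JSON (double quotes, true/false), json.loads would likely be the more natural choice.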

DataFrame used:

   _id                                                                                                                                                                                                                               userInputs
0  xxx  [{'userID': 22, 'milesRequested': 170, 'WhPerMile': 350, 'minutesAvailable': 550, 'modifiedAt': 'Mon, 30 Apr 2018 15:08:54 GMT', 'paymentRequired': True, 'requestedDeparture': 'Tue, 01 May 2018 00:17:49 GMT', 'kWhRequested': 59.5}]
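
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample rows are made up (only the key names come from the question), and it swaps in a variation on the approach above: explode with ignore_index=True (available in pandas 1.1 and later) so the index stays unique, then builds the new columns with pd.DataFrame on the list of dicts instead of apply(pd.Series). It assumes every row holds a non-empty list of dicts.

```python
from ast import literal_eval

import pandas as pd

# Hypothetical sample data; the second row has two dicts in its list
df = pd.DataFrame({
    "_id": ["xxx", "yyy"],
    "userInputs": [
        "[{'userID': 22, 'milesRequested': 170, 'kWhRequested': 59.5}]",
        "[{'userID': 7, 'milesRequested': 40, 'kWhRequested': 14.0}, "
        "{'userID': 7, 'milesRequested': 60, 'kWhRequested': 21.0}]",
    ],
})

# Parse the strings into lists of dicts
df["userInputs"] = df["userInputs"].apply(literal_eval)

# One row per dict; ignore_index=True gives a fresh 0..n-1 index
df = df.explode("userInputs", ignore_index=True)

# Build the new columns from the dicts, aligned on the same index
expanded = pd.DataFrame(df.pop("userInputs").tolist(), index=df.index)
result = df.join(expanded)

print(result)
```

Building the frame from a list of dicts in one call tends to be faster than apply(pd.Series) on large data, though I haven't measured it on this dataset.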