Home > Software design >  Convert Json Object to independent columns in pandas
Convert Json Object to independent columns in pandas

Time:06-07

I have the following csv file (converted to Dataframe)

asOfDate,name,details                                         |
2022-06-06,Vasanth,{"degree":"BE", "score":72.0, "subject" :"ECE"}  

And then, using the above file, I need to convert to the following using pandas

asOfDate,name,details.degree,details.score,details.subject
2022-06-06,Vasanth,BE,72.0,ECE

I tried to use json_normalize, but getting error "str" has no attribute values

Please let me know how to convert into prescribed format

CodePudding user response:

You can expand the json field to columns by applying json.loads to it, convert the values to a list and then join that to the rest of the dataframe, dropping the redundant details column:

df.join(pd.DataFrame(df['details'].apply(json.loads).tolist())).drop('details',axis=1)

Output:

     asOfDate     name degree  score subject
0  2022-06-06  Vasanth     BE   72.0     ECE

CodePudding user response:

I think you received the error because the data in the details column is still a sting not a json/dict object.

Here we print the type of the first element in the details column

df = pd.read_csv('df.csv')
print(type(df['details'][0]))
<class 'str'>
  1. So you can simply convert it to json/dict object as Nick suggested
df['details'] = df['details'].apply(json.loads)
print(type(df['details'][0]))
<class 'dict'>
  1. Use json_normalize to convert the json objects to DataFrame
df2 = pd.json_normalize(df['details'])
print(df2)
  degree  score subject
0     BE   72.0     ECE
1     BE   72.0     ECE
  1. we join the 2 dataframes and remove the details column
df = df.join(df2).drop('details', axis=1)
print(df)
     asOfDate     name degree  score subject
0  2022-06-06  Vasanth     BE   72.0     ECE
1  2022-06-06  Vasanth     BE   72.0     ECE

finally all in one line

import json
import pandas as pd

df = pd.read_csv('df.csv')
df = df.join(pd.json_normalize(df['details'].apply(json.loads))).drop('details', axis=1)
print(df)
     asOfDate     name degree  score subject
0  2022-06-06  Vasanth     BE   72.0     ECE
1  2022-06-06  Vasanth     BE   72.0     ECE
  • Related