Home > database >  I have a dataframe with a json substring in 1 of the columns. i want to extract variables and make c
I have a dataframe with a json substring in 1 of the columns. i want to extract variables and make c

Time:05-14

imports json    
df = pd.read_json("C:/xampp/htdocs/PHP code/APItest.json", orient='records')
print(df)

DataFrame

I would like to create three columns extra: ['name','logo','ico_score'] instead of the 'main' column

I have tried:

df2 = df.join(pd.DataFrame(list(json.loads(d).values())[0] for d in df.pop('main')) )

but gets this TypeError:

the JSON object must be str, bytes or bytearray, not dict

I hope someone can help me find a way, so i can end up with a data table for statistics. Best regards the student.

This is how my json data looks like:

[
    {
        "id": "126",
        "main": {"name": "SONM", "logo": "link", "ico_score": "6.7"},
        "links": {"url": "link"},
        "finance": {"raised": "42000000"},
    },
    {
        "id": "132",
        "main": {"name": "openANX", "logo": "link", "ico_score": "5.7"},
        "links": {"url": "link"},
        "finance": {"raised": "18756937"},
    },
    {
        "id": "166",
        "main": {"name": "Boul\\u00e9", "logo": "link", "ico_score": "5.6"},
        "links": {"url": "link"},
        "finance": {"raised": ""},
    },
]  

CodePudding user response:

IIUC you can do it like this:

with open('your_json_file.json') as f:
    data = json.load(f)

df = pd.json_normalize(data)
df.columns = ['id', 'name', 'logo', 'ico_score', 'url', 'raised']
print(df)

    id        name  logo ico_score   url    raised
0  126        SONM  link       6.7  link  42000000
1  132     openANX  link       5.7  link  18756937
2  166  Boul\u00e9  link       5.6  link 
  • Related