flatten JSON dataframe in python

Time:12-14

I have made an API request and I am receiving the JSON in the nested format below (along with what I expected).

I don't often have to flatten JSON data, and when I do, I just use json_normalize. I have tried json_normalize here, but it hasn't had any effect this time.

Any help would be much appreciated.

#ACTUAL
[
    {
        "id": 1000,
        "tableName": {
            "": {
                "field1": null,
                "field2": null
            }
        }
    },
    {
        "id": 1001,
        "tableNameTwo": {
            "": {
                "field1": null,
                "field2": null
            }
        }
    }
]


#EXPECTED
[
    {
        "id": 1000,
        "field1": null,
        "field2": null
    },
    {
        "id": 1001,
        "field1": null,
        "field2": null
    },
    ...
]

CodePudding user response:

Flattening a JSON response is almost always bad practice. Consider, for example, what happens when a key is duplicated in a nested object (most of the time the nested object has its own id): how would you resolve that conflict?
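
To make the duplication concern concrete, here is a minimal sketch with made-up data (the nested `owner` object and its `id` are hypothetical, not from the question). A naive flatten lets the inner `id` silently overwrite the outer one; prefixing nested keys with their path is one possible way around that.

```python
def flatten_naive(obj, out=None):
    """Copy every non-dict value into one dict, ignoring nesting."""
    out = {} if out is None else out
    for key, value in obj.items():
        if isinstance(value, dict):
            flatten_naive(value, out)
        else:
            out[key] = value  # a later duplicate key overwrites an earlier one
    return out


def flatten_prefixed(obj, prefix="", sep="_"):
    """Same idea, but each key keeps the path of its parents as a prefix."""
    out = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            out.update(flatten_prefixed(value, name, sep))
        else:
            out[name] = value
    return out


# Hypothetical record: the nested object carries its own "id".
record = {"id": 1000, "owner": {"id": 7, "name": "a"}}

naive = flatten_naive(record)
prefixed = flatten_prefixed(record)
print(naive)     # the outer id 1000 is lost
print(prefixed)  # both ids survive under different names
```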

However, if this is not for production use and you are just doing some homework or similar, you can use the following code. I tested it on your example:

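records = [
    {
        "id": 1000,
        "tablename": {
            "": {
                "field1": None,
                "field2": None,
                "field3": None,
                "field4": None,
                "field5": None,
                "field6": None,
                "field7": None,
                "field8": None,
                "field9": None
            }
        }
    }
]


def flat_json_object(obj, flat=None):
    # Recursively copy every non-dict value into a single flat dict.
    # Duplicate keys at different nesting levels overwrite each other.
    if flat is None:
        flat = {}
    for k, v in obj.items():
        if isinstance(v, dict):
            flat_json_object(v, flat)
        else:
            flat[k] = v
    return flat


def flat_json_array(array):
    # Flatten each record on its own so records do not overwrite one another.
    return [flat_json_object(obj) for obj in array]


new_map = flat_json_array(records)
print(new_map)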

CodePudding user response:

Nice problem you have there! I am told you're a new contributor. That's good to hear, because you are the first person I am answering here as well. Hope you like it.

This doesn't look too hard. What I would recommend is the following.

import pandas as pd

your_list = [
    {
        "id": 1000,
        "tablename": {
            "tableZero": {
                "field0": None,
                "field1": None,
                "field2": None,
            }
        }
    }
]

df = pd.json_normalize(your_list, sep='_')

print(df.to_dict(orient='records')[0])

This will output the following dict.

your_result = {
    "id": 1000,
    "tablename_tableZero_field0": None,
    "tablename_tableZero_field1": None,
    "tablename_tableZero_field2": None,
}
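
The question asks for bare field names rather than prefixed ones. One possible follow-up step, sketched here in plain Python on a dict shaped like the output above, keeps only the last segment of each key. This assumes the field names themselves do not contain the separator and that no two keys end in the same field name; `strip_prefix` is an illustrative helper, not a pandas function.

```python
def strip_prefix(row, sep="_"):
    """Keep only the text after the last separator in each key."""
    return {key.rsplit(sep, 1)[-1]: value for key, value in row.items()}


# A dict shaped like the json_normalize result shown above.
row = {
    "id": 1000,
    "tablename_tableZero_field0": None,
    "tablename_tableZero_field1": None,
    "tablename_tableZero_field2": None,
}

short = strip_prefix(row)
print(short)
```

Keys without a separator, such as `id`, pass through unchanged.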

If you like the answer, please don't hesitate to accept it!

CodePudding user response:

UPDATED: The code below should work for you.

Note: I have changed null to None in your input data.

records = [
    {
        "id": 1000,
        "tableName": {
            "": {
                "field1": None,
                "field2": None,
            }
        }
    },
    {
        "id": 1001,
        "tableNameTwo": {
            "": {
                "field1": None,
                "field2": None,
            }
        }
    }
]

flatList = []
for record in records:
    tempDict = {}
    for key, value in record.items():
        if key.startswith('table'):
            # Table entries keep their fields under an empty-string key.
            tempDict.update(value[''])
        else:
            tempDict[key] = value

    flatList.append(tempDict)

print(flatList)

Output:

[
  {
    'id': 1000,
    'field1': None,
    'field2': None
  },
  {
    'id': 1001,
    'field1': None,
    'field2': None
  }
]
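
On the note about changing null to None: if the API response is still JSON text, `json.loads` does that conversion for you, so the data does not need editing by hand. A minimal sketch, assuming the response body is available as a string (`raw` below is a stand-in built from the question's sample, written as strict JSON). It also checks for a dict value rather than a key starting with "table", which is a swapped-in choice, not part of the answer above.

```python
import json

raw = """
[
    {"id": 1000, "tableName": {"": {"field1": null, "field2": null}}},
    {"id": 1001, "tableNameTwo": {"": {"field1": null, "field2": null}}}
]
"""

records = json.loads(raw)  # JSON null becomes Python None

flat = []
for record in records:
    row = {}
    for key, value in record.items():
        if isinstance(value, dict):
            # Merge the fields stored under the empty-string key, if present.
            row.update(value.get("", {}))
        else:
            row[key] = value
    flat.append(row)

print(flat)
```

If a DataFrame is needed, the resulting list of dicts can be passed to `pd.DataFrame`.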

If there are any flaws in the code, please add a comment and the answer can be updated.
