JSON traversal and editing in python

Time:10-22

I have a number of JSON responses that look like the one below:

{
    "response": {
        "list": [
            {
                "key1": "val1",
                "key2": "val2"
            },
            {
                "key3": "val3",
                "key4": ""
            }
        ],
        "data": {
            "key5": "val5",
            "key6": "",
            "key7": "val7"
        },
        "key8": "val8",
        "key9": "val9",
        "key10": "",
        "key10": "val10"
    }
}

Some JSON responses may have a completely different structure. Some may have more inner lists and dicts, and some may be simple flat JSON. I need to replace all the empty strings ("") with a null/None value. I am not sure how to traverse JSON of unknown structure while replacing the values.

Any help would be appreciated.

CodePudding user response:

The simplest solution would be to use the Python jq bindings, which allow you to write a short filter to process the JSON before decoding it.

response = '''{
    "response": {
        "list": [
            {
                "key1": "val1",
                "key2": "val2"
            },
            {
                "key3": "val3",
                "key4": ""
            }
        ],
        "data": {
            "key5": "val5",
            "key6": "",
            "key7": "val7"
        },
        "key8": "val8",
        "key9": "val9",
        "key10": "",
        "key10": "val10"
    }
}'''

>>> import jq
>>> jq.all('walk(select(.=="")|=null)', text=response)[0]
{'response': {'list': [{'key1': 'val1', 'key2': 'val2'}, {'key3': 'val3', 'key4': None}], 'data': {'key5': 'val5', 'key6': None, 'key7': 'val7'}, 'key8': 'val8', 'key9': 'val9', 'key10': 'val10'}}

CodePudding user response:

The simplest way without a library is to create and use a recursive function:

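def replace_blanks(o):
    """Return a new structure equal to o, with every empty string replaced by None."""
    if isinstance(o, dict):
        # Recurse into each value; keys are left untouched.
        return {k: replace_blanks(v) for k, v in o.items()}
    if isinstance(o, list):
        return [replace_blanks(elem) for elem in o]
    # Leaf value: only the empty string is replaced.
    return None if o == '' else o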

Usage:

from pprint import pprint


d = {
    "response": {
        "list": [
            {
                "key1": "val1",
                "key2": "val2"
            },
            {
                "key3": "val3",
                "key4": ""
            }
        ],
        "data": {
            "key5": "val5",
            "key6": "",
            "key7": "val7"
        },
        "key8": "val8",
        "key9": "val9",
        "key10": "",
        "key10": "val10"
    }
}

pprint(replace_blanks(d))

Prints:

{'response': {'data': {'key5': 'val5', 'key6': None, 'key7': 'val7'},
              'key10': 'val10',
              'key8': 'val8',
              'key9': 'val9',
              'list': [{'key1': 'val1', 'key2': 'val2'},
                       {'key3': 'val3', 'key4': None}]}}
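
Note that replace_blanks above builds and returns a new structure rather than changing d. If you would rather modify the decoded data in place, a minimal sketch could look like the following. The name replace_blanks_in_place is hypothetical, and the sketch assumes the data contains only dicts, lists, and scalar values, as decoded JSON does:

```python
def replace_blanks_in_place(o):
    """Replace every '' nested inside o with None, mutating o itself."""
    if isinstance(o, dict):
        items = o.items()        # (key, value) pairs
    elif isinstance(o, list):
        items = enumerate(o)     # (index, value) pairs
    else:
        return o                 # scalar: nothing to descend into
    for key, value in items:
        if value == '':
            # Assigning to an existing key or index is safe while iterating.
            o[key] = None
        else:
            replace_blanks_in_place(value)
    return o


d = {"a": "", "b": [{"c": ""}, "", ["", "x"]], "e": "v"}
replace_blanks_in_place(d)
print(d)
```

The in-place version avoids copying large responses, at the cost of changing the object that any other code holding a reference to it will see.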

If your data is a JSON string that you are loading into a Python object, you can instead pass the function as the object_hook argument to json.loads, which calls it on every JSON object (dict) it decodes:

import json

json_string = """
{"response": {"list": [{"key1": "val1", "key2": "val2"}, {"key3": "val3", "key4": ""}],
              "data": {"key5": "val5", "key6": "", "key7": "val7"},
              "key8": "val8", "key9": "val9", "key10": "val10"}}
"""

data = json.loads(json_string, object_hook=replace_blanks)
pprint(data)
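
One caveat: object_hook is only invoked for JSON objects (dicts), so empty strings that are not inside any dict, for example in a top-level list, are left as they are. If your responses can have that shape, decoding first and then walking the whole result is the safer option. The sketch below illustrates the difference; loads_without_blanks is a hypothetical helper name, and replace_blanks is repeated so the snippet runs on its own:

```python
import json


def replace_blanks(o):
    if isinstance(o, dict):
        return {k: replace_blanks(v) for k, v in o.items()}
    if isinstance(o, list):
        return [replace_blanks(elem) for elem in o]
    return None if o == '' else o


def loads_without_blanks(text):
    # Decode first, then walk the entire result, whatever its top-level type.
    return replace_blanks(json.loads(text))


text = '["", {"a": "", "b": ["", "x"]}]'

hooked = json.loads(text, object_hook=replace_blanks)
walked = loads_without_blanks(text)

print(hooked)  # ['', {'a': None, 'b': [None, 'x']}]
print(walked)  # [None, {'a': None, 'b': [None, 'x']}]
```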