I have a number of JSON responses that look like the one below:
{
  "response": {
    "list": [
      {
        "key1": "val1",
        "key2": "val2"
      },
      {
        "key3": "val3",
        "key4": ""
      }
    ],
    "data": {
      "key5": "val5",
      "key6": "",
      "key7": "val7"
    },
    "key8": "val8",
    "key9": "val9",
    "key10": "",
    "key10": "val10"
  }
}
Some JSON responses may have a completely different structure. Some may have more nested lists and dicts, and some may be simple flat JSON. I need to replace all the empty strings ("") with a null/None value, but I am not sure how to traverse JSON of unknown structure while replacing the values.
Any help would be appreciated.
CodePudding user response:
The simplest solution would be to use the Python jq bindings, which allow you to write a short filter to process the JSON before decoding it.
import jq

response = '''{
  "response": {
    "list": [
      {
        "key1": "val1",
        "key2": "val2"
      },
      {
        "key3": "val3",
        "key4": ""
      }
    ],
    "data": {
      "key5": "val5",
      "key6": "",
      "key7": "val7"
    },
    "key8": "val8",
    "key9": "val9",
    "key10": "",
    "key10": "val10"
  }
}'''
>>> jq.all('walk(select(.=="")|=null)', text=response)[0]
{'response': {'list': [{'key1': 'val1', 'key2': 'val2'}, {'key3': 'val3', 'key4': None}], 'data': {'key5': 'val5', 'key6': None, 'key7': 'val7'}, 'key8': 'val8', 'key9': 'val9', 'key10': 'val10'}}
CodePudding user response:
The simplest way without a library is to create and use a recursive function:
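def replace_blanks(o):
    """Return a copy of o with every empty string replaced by None."""
    if isinstance(o, dict):
        # Recurse into every value of a dict, keeping the keys as they are.
        return {k: replace_blanks(v) for k, v in o.items()}
    if isinstance(o, list):
        # Recurse into every element of a list.
        return [replace_blanks(elem) for elem in o]
    # Leaf value: swap '' for None, leave anything else untouched.
    return None if o == '' else o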
Usage:
from pprint import pprint

d = {
    "response": {
        "list": [
            {
                "key1": "val1",
                "key2": "val2"
            },
            {
                "key3": "val3",
                "key4": ""
            }
        ],
        "data": {
            "key5": "val5",
            "key6": "",
            "key7": "val7"
        },
        "key8": "val8",
        "key9": "val9",
        "key10": "",
        "key10": "val10"
    }
}

pprint(replace_blanks(d))
Prints:
{'response': {'data': {'key5': 'val5', 'key6': None, 'key7': 'val7'},
              'key10': 'val10',
              'key8': 'val8',
              'key9': 'val9',
              'list': [{'key1': 'val1', 'key2': 'val2'},
                       {'key3': 'val3', 'key4': None}]}}
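As a quick check that the same function copes with the other shapes mentioned in the question (a flat object, a top-level list, deeper nesting), here is a minimal self-contained sketch. The sample inputs are made up for illustration and are not taken from the original responses.

```python
def replace_blanks(o):
    """Return a copy of o with every empty string replaced by None."""
    if isinstance(o, dict):
        return {k: replace_blanks(v) for k, v in o.items()}
    if isinstance(o, list):
        return [replace_blanks(elem) for elem in o]
    return None if o == '' else o

# Hypothetical sample inputs, not taken from the question.
flat = {"a": "", "b": "x"}
top_level_list = ["", {"c": ""}, [["", "y"]]]

flat_result = replace_blanks(flat)
list_result = replace_blanks(top_level_list)

print(flat_result)   # {'a': None, 'b': 'x'}
print(list_result)   # [None, {'c': None}, [[None, 'y']]]
```

The function builds new dicts and lists rather than modifying its argument, so the original object is left untouched.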
If your data is a JSON string and you're trying to load it into a Python object, you can also pass the object_hook argument to json.loads:
import json
from pprint import pprint

json_string = """
{"response": {"list": [{"key1": "val1", "key2": "val2"}, {"key3": "val3", "key4": ""}],
"data": {"key5": "val5", "key6": "", "key7": "val7"},
"key8": "val8", "key9": "val9", "key10": "val10"}}
"""

# object_hook is called with each decoded JSON object (dict) as it is built.
data = json.loads(json_string, object_hook=replace_blanks)
pprint(data)
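One caveat worth knowing: json.loads calls object_hook only for JSON objects (dicts), so empty strings in a top-level array that do not sit inside any object are left alone. The sketch below shows the difference on a made-up input string (json_array is a hypothetical example, not from the question) and, as an alternative, applies replace_blanks to the fully decoded value instead.

```python
import json

def replace_blanks(o):
    """Return a copy of o with every empty string replaced by None."""
    if isinstance(o, dict):
        return {k: replace_blanks(v) for k, v in o.items()}
    if isinstance(o, list):
        return [replace_blanks(elem) for elem in o]
    return None if o == '' else o

# Hypothetical input: a top-level JSON array rather than an object.
json_array = '["", {"k": ""}, ["", "v"]]'

# The hook only sees the inner dict, so the bare '' entries survive.
hooked = json.loads(json_array, object_hook=replace_blanks)

# Decoding first and post-processing the result covers every position.
post = replace_blanks(json.loads(json_array))

print(hooked)  # ['', {'k': None}, ['', 'v']]
print(post)    # [None, {'k': None}, [None, 'v']]
```

If every response is known to be a JSON object at the top level, the object_hook form is sufficient; post-processing is the safer choice when the top-level type is unknown.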