Will json.dumps(..., sort_keys=True) recursively sort nested dictionaries as well?

I need to ensure that the string produced by json.dumps does not ever change if dictionary keys are reordered.

From testing, passing sort_keys=True does the trick: keys are sorted recursively, in nested dictionaries as well as at the top level.

However, the official docs do not say explicitly whether the sorting is recursive:

If sort_keys is true (default: False), then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.

Should I write my own recursive function to sort the keys before dumping, or can I rely on Python to do it?

import json
a = {
  "one": "one",
  "nested": {
    "two": "two",
    "three": "three",
    "nested": {
      "four": "four",
      "five": "five"
    }
  }
}
a_str = json.dumps(a, sort_keys=True)
print(a_str)

b = {
  "nested": {    
    "three": "three",
    "two": "two",
    "nested": {      
      "five": "five",
      "four": "four"
    }
  },
  "one": "one"
}
b_str = json.dumps(b, sort_keys=True)
print(b_str)

print(a_str == b_str)  # prints True
assert a_str == b_str

assert a_str != json.dumps(b)  # holds because sort_keys defaults to False, so b keeps its insertion order


CodePudding user response：

Yes, you can rely on that behavior.

In fact, the statement you quote from the docs:

this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis

would be wrong if sort_keys didn't work recursively on nested JSON objects.
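
As a quick check of this, here is a minimal sketch (the sample data x and y are made up for illustration) showing that sort_keys=True also reaches dictionaries nested inside lists, while the order of the list elements themselves is left alone:

```python
import json

# Hypothetical sample data: the same content, with keys inserted in different orders.
x = {"b": [{"z": 1, "a": 2}, {"y": {"d": 4, "c": 3}}], "a": 0}
y = {"a": 0, "b": [{"a": 2, "z": 1}, {"y": {"c": 3, "d": 4}}]}

x_str = json.dumps(x, sort_keys=True)
y_str = json.dumps(y, sort_keys=True)

# Every dictionary is emitted with sorted keys, however deeply it is nested.
assert x_str == y_str
print(x_str)  # {"a": 0, "b": [{"a": 2, "z": 1}, {"y": {"c": 3, "d": 4}}]}

# Lists are not sorted: only dictionary keys are affected.
list_str = json.dumps([2, 1], sort_keys=True)
assert list_str == "[2, 1]"
```

Note that two structures whose lists hold the same elements in a different order will still serialize differently, so list ordering has to be handled separately if it matters.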

CodePudding user response：

The documentation is clear enough if you read it as saying:

all dictionaries encoded will have their keys sorted

This is also evident from the source of the JSON encoder: _iterencode_dict is called for each dictionary encountered (even one returned from, e.g., the default= callable).

https://github.com/python/cpython/blob/8e75c6b49b7cb8515b917f01b32ece8c8ea2c0a0/Lib/json/encoder.py#L333-L355

def _iterencode_dict(dct, _current_indent_level):
    # ...
    if _sort_keys:
        items = sorted(dct.items())
    else:
        items = dct.items()
    # ...
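
The remark about default= can be checked with a small sketch. Point and to_jsonable are hypothetical names introduced only for this example; the dictionary returned by the callable deliberately has its keys out of order:

```python
import json

class Point:
    # Hypothetical class for illustration; json cannot serialize it natively.
    def __init__(self, x, y):
        self.x = x
        self.y = y

def to_jsonable(obj):
    # json.dumps calls this for any object it does not know how to serialize.
    if isinstance(obj, Point):
        return {"y": obj.y, "x": obj.x}  # keys deliberately not in sorted order
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

result = json.dumps({"p": Point(1, 2), "a": 0}, default=to_jsonable, sort_keys=True)
print(result)  # {"a": 0, "p": {"x": 1, "y": 2}}
```

The dictionary produced by to_jsonable comes out with sorted keys too, which matches the observation that every dictionary the encoder meets goes through the same sorting step.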