How to sum equal key values when inserting them into a new dictionary in Python?

Time:09-15

I have a dictionary that I read from a .txt file.

dictOne = {
             "AAA": 0,
             "BBB": 1,
             "AAA": 3,
             "BBB": 1,
           }

I would like to generate a new dictionary called dictTwo with the sum of values of equal keys. Result:

dictTwo = {
             "AAA": 3,
             "BBB": 2,
           }

I wrote the following code, but it raises a syntax error (SyntaxError: invalid syntax):

import json

dictOne = json.loads(text)
dictTwo = {}

for k, v in dictOne.items():
    dictTwo [k] = v += v

Can anyone help me find the error?

CodePudding user response:

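Assuming you resolve the duplicate-key issue first (a Python dict literal silently keeps only the last value for a repeated key, so dictOne below actually holds {"AAA": 3, "BBB": 1}), start with an empty dictTwo and accumulate with +=:

dictOne = {
         "AAA": 0,
         "BBB": 1,
         "AAA": 3,
         "BBB": 1
       }

dictTwo = {}

for k, v in dictOne.items():
    if k in dictTwo:
        dictTwo[k] += v   # key already seen: add to the running total
    else:
        dictTwo[k] = v    # first occurrence: store the value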

print(dictTwo)
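
If the data can be kept as (key, value) pairs before it ever becomes a dict, the duplicates survive and the same loop gives the expected totals. The following is only a sketch: the `pairs` list is a hypothetical stand-in for however the .txt file gets parsed.

```python
# Hypothetical input: the question's data as a list of pairs,
# which (unlike a dict) can hold the same key more than once.
pairs = [("AAA", 0), ("BBB", 1), ("AAA", 3), ("BBB", 1)]

dictTwo = {}
for k, v in pairs:
    # dict.get returns 0 for a key that has not been seen yet
    dictTwo[k] = dictTwo.get(k, 0) + v

print(dictTwo)  # {'AAA': 3, 'BBB': 2}
```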

CodePudding user response:

You can do this if you do it while reading the JSON input.

JSON permits duplicate keys in objects, although it discourages the practice, noting that different JSON processors produce different results for duplicate keys.

Python does not allow duplicate keys in dictionaries, and Python's json module handles duplicate keys in one of the ways noted by the JSON standard: it ignores all but the last value for any such key. However, it gives you a mechanism to do your own processing of objects, in case you want to do something else with duplicate keys (or produce something other than a dictionary).
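
As a quick check of the default behaviour, here is a minimal sketch using the question's data as a JSON string (the variable names are just for illustration):

```python
import json

# The question's data as JSON text, with repeated keys.
text = '{"AAA": 0, "BBB": 1, "AAA": 3, "BBB": 1}'

# Without a hook, a later value overwrites an earlier one for the same key.
parsed = json.loads(text)
print(parsed)  # {'AAA': 3, 'BBB': 1}
```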

You do this by providing the object_pairs_hook parameter to json.load or json.loads. That parameter should be a function whose argument is an iterable of (key, value) pairs, where the key is a string and the value is an already processed JSON object. Whatever the function returns will be the value used by json.load for an object literal; it does not need to return a dict.
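
To see what the hook actually receives, here is a small sketch whose hook returns the pairs untouched. The name `keep_pairs` is made up for this illustration.

```python
import json

def keep_pairs(pairs):
    # `pairs` holds (key, value) tuples in document order,
    # including any repeated keys.
    return list(pairs)

result = json.loads('{"AAA": 0, "BBB": 1, "AAA": 3}',
                    object_pairs_hook=keep_pairs)
print(result)  # [('AAA', 0), ('BBB', 1), ('AAA', 3)]
```

Note that the object is decoded to a list here, not a dict, which is allowed.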

That implies that the handling of duplicate keys will be the same for every object literal in the JSON input, which is a bit of a limitation, but it may be acceptable in your case.

Here's a simple example:

import json

def key_combiner(pairs):
    rv = {}
    for k, v in pairs:
        if k in rv:
            rv[k] += v
        else:
            rv[k] = v
    return rv

# Sample usage:
# (Note: JSON doesn't allow trailing commas in objects or lists.)
json_data = '''{
             "AAA": 0,
             "BBB": 1,
             "AAA": 3,
             "BBB": 1
           }'''

consolidated = json.loads(json_data, object_pairs_hook=key_combiner)
print(consolidated)

This prints {'AAA': 3, 'BBB': 2}.
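
The limitation mentioned earlier (the same handling for every object literal) can be seen with a nested input. This is a sketch with made-up data; the hook is repeated so the snippet runs on its own.

```python
import json

def key_combiner(pairs):
    rv = {}
    for k, v in pairs:
        if k in rv:
            rv[k] += v
        else:
            rv[k] = v
    return rv

# Hypothetical input: duplicates at the top level and inside a nested object.
nested_json = '{"totals": {"x": 1, "x": 2}, "n": 1, "n": 4}'

# The hook is called for the inner object first, then for the outer one.
nested = json.loads(nested_json, object_pairs_hook=key_combiner)
print(nested)  # {'totals': {'x': 3}, 'n': 5}
```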

If I'd known that the values were numbers, I could have used a slightly simpler definition using defaultdict. Writing it the way I did permits combining certain other value types, such as strings or arrays, provided that all the values for the same key in an object are the same type. (Unfortunately, it doesn't allow combining objects, because Python uses | to combine two dicts, instead of +.)
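
For reference, here is roughly what the defaultdict variant might look like, next to the generic combiner applied to strings and arrays. The function name `sum_numbers` and the second input are made up for this sketch.

```python
import json
from collections import defaultdict

def sum_numbers(pairs):
    # Assumes every value is a number; missing keys start at 0.
    totals = defaultdict(int)
    for k, v in pairs:
        totals[k] += v
    return dict(totals)

def key_combiner(pairs):
    # Generic version: works for any values that support +=.
    rv = {}
    for k, v in pairs:
        if k in rv:
            rv[k] += v
        else:
            rv[k] = v
    return rv

summed = json.loads('{"AAA": 0, "BBB": 1, "AAA": 3, "BBB": 1}',
                    object_pairs_hook=sum_numbers)
print(summed)  # {'AAA': 3, 'BBB': 2}

# Strings are concatenated and arrays (Python lists) are extended.
mixed = json.loads('{"s": "ab", "s": "cd", "l": [1], "l": [2, 3]}',
                   object_pairs_hook=key_combiner)
print(mixed)  # {'s': 'abcd', 'l': [1, 2, 3]}
```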

This feature was mostly intended to be used for creating class instances from json objects, but it has many other possible uses.
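
As an illustration of that kind of use, here is a minimal sketch that decodes a JSON object into a class instance. The `Point` class and `to_point` hook are hypothetical, and the hook assumes every object in the input has exactly the fields `x` and `y`.

```python
import json
from dataclasses import dataclass

@dataclass
class Point:  # hypothetical class, for illustration only
    x: int
    y: int

def to_point(pairs):
    # Turn the (key, value) pairs into keyword arguments.
    return Point(**dict(pairs))

p = json.loads('{"x": 1, "y": 2}', object_pairs_hook=to_point)
print(p)  # Point(x=1, y=2)
```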
