Comparing values in a list of dictionaries in Python

I am currently trying to do the following:

def ResultHandler(extractedResult: list):
    jsonObj = {}
    jsonList = []
    for result in extractedResult:
        for key, val in result.items():
            #this works if its hardcoded val to a number...
            if key == "Id" and val == 1:
                jsonObj.update(result)
    jsonList.append(jsonObj)
    return jsonList

I have a list of dictionaries like {"Id": 1, "title": "example"}, and so on. The same list also holds other dictionaries with associated information, like {"Id": 1, "location": "city"}. I want to combine the dictionaries that share an Id to get {"Id": 1, "title": "example", "location": "city"} for every matching Id. In this case the list has 200 items: 100 titles and 100 locations, all with Ids from 0 to 99. I want to return a list of 100 combined dictionaries.

CodePudding user response：

Group the dicts by ID. Then merge each group.

from collections import defaultdict

def merge_dicts(dicts):
    grouped = defaultdict(list)
    for d in dicts:
        grouped[d['Id']].append(d)  # the key is spelled "Id" in the question's data

    merged = []
    for k, ds in grouped.items():
        m = {}
        for d in ds:
            m |= d        # If Python < 3.9 : m = {**m, **d}
        merged.append(m)

    return merged
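
The grouping and merging steps can also be folded into a single pass. The following is a minimal sketch, not the answerer's code: the name merge_by_id and the key parameter are hypothetical, and it assumes every dictionary carries the "Id" key and that later dictionaries may overwrite earlier values for the same key.

```python
def merge_by_id(dicts, key="Id"):
    """Merge dictionaries that share the same value under `key`."""
    merged = {}
    for d in dicts:
        # setdefault creates an empty dict the first time an Id is seen,
        # then update() copies the current dictionary's entries into it.
        merged.setdefault(d[key], {}).update(d)
    return list(merged.values())


data = [
    {"Id": 1, "title": "example"},
    {"Id": 1, "location": "city"},
    {"Id": 2, "title": "other"},
]

print(merge_by_id(data))
# [{'Id': 1, 'title': 'example', 'location': 'city'}, {'Id': 2, 'title': 'other'}]
```

Because each merged dictionary is built fresh, the input dictionaries are left unchanged.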

CodePudding user response：

A more functional (but less efficient) approach:

from itertools import groupby
from functools import reduce
from operator import itemgetter


data = [{"Id": 1, "title": "example1"},
        {"Id": 2, "title": "example2"},
        {"Id": 3, "title": "example3"},
        {"Id": 4, "title": "example4"},
        {"Id": 1, "location": "city1"},
        {"Id": 2, "location": "city2"},
        {"Id": 4, "location": "city4"},
        {"Id": 5, "location": "city5"}]


new_data = []
for _, g in groupby(sorted(data, key=itemgetter("Id")), key=itemgetter("Id")):
    new_data.append(reduce(lambda d1, d2: {**d1, **d2}, g))
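
The sorted() call in the loop above matters because itertools.groupby only groups consecutive items with equal keys. A small sketch of the difference, using a made-up three-item list (rows is a hypothetical name, not part of the answer):

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {"Id": 1, "title": "a"},
    {"Id": 2, "title": "b"},
    {"Id": 1, "location": "x"},
]

get_id = itemgetter("Id")

# Without sorting, the two Id 1 entries are not adjacent,
# so groupby yields Id 1 twice as separate groups.
unsorted_keys = [k for k, _ in groupby(rows, key=get_id)]

# After sorting by Id, equal keys are adjacent and form one group each.
sorted_keys = [k for k, _ in groupby(sorted(rows, key=get_id), key=get_id)]

print(unsorted_keys)  # [1, 2, 1]
print(sorted_keys)    # [1, 2]
```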

CodePudding user response：

The function below uses a nested loop. The outer loop iterates through the input list. For each dictionary, it checks whether a dictionary with the same id is already in the result list. If not, it appends the dictionary to the result. If so, it finds the matching dictionary in the result and updates it with the contents of the current one.

lst = [
    {"id": 1, "fname": "John"},
    {"id": 2, "name": "Bob"},
    {"id": 1, "lname": "Mary"},
]
def combine_dicts(lst):
    res = []
    for d in lst:
        if d.get("id") not in [x.get("id") for x in res]:
            res.append(dict(d))  # append a copy so the input dicts are not mutated later
        else:
            for r in res:
                if r.get("id") == d.get("id"):
                    r.update(d)
    return res


print(combine_dicts(lst))
# output: [{'id': 1, 'fname': 'John', 'lname': 'Mary'}, {'id': 2, 'name': 'Bob'}]

CodePudding user response：

The following code should work:

def resultHandler(extractedResult):
  jsonList = []
  for i in range(len(extractedResult) // 2):
    jsonList.append({"Id": i})
  for i in range(len(extractedResult)):
    for j in range(len(jsonList)):
      if jsonList[j]["Id"] == extractedResult[i]["Id"]:
        if "title" in extractedResult[i]:
          jsonList[j]["title"] = extractedResult[i]["title"]
        else:
          jsonList[j]["location"] = extractedResult[i]["location"]
  return jsonList

extractedResult = [{"Id": 0, "title":"example1"}, {"Id": 1, "title":"example2"}, {"Id": 0, "location":"example3"}, {"Id": 1, "location":"example4"}]

jsonList = resultHandler(extractedResult)

print(jsonList)

Output:

[{'Id': 0, 'title': 'example1', 'location': 'example3'}, {'Id': 1, 'title': 'example2', 'location': 'example4'}]

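This code works by first filling jsonList with placeholder dictionaries whose Id values run from 0 up to half the length of extractedResult (the number of distinct Ids). It therefore assumes the Ids are consecutive integers starting at 0 and that each Id appears exactly twice.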

Then, for every dictionary in extractedResult, we find the dictionary in jsonList with the matching Id. If the extractedResult dictionary contains the key "title", we copy that value into the jsonList dictionary. Otherwise, we copy its "location".

I hope this helps answer your question! Please let me know if you need any further clarification or details :)

CodePudding user response：

This code solves the problem in linear time, i.e. O(n), where n is the length of the list. It considers only those Ids that have both a title and a location, and ignores the rest. It assumes that "Id" is the first key of every dictionary and that each Id's title entry appears in the list before its location entry.

from collections import Counter

data = [{"Id": 1, "title":"example1"},
        {"Id": 2, "title":"example2"},
        {"Id": 3, "title":"example3"},
        {"Id": 4, "title":"example4"},
        {"Id": 1, "location":"city1"},
        {"Id": 2, "location":"city2"},
        {"Id": 4, "location":"city4"},
        {"Id": 5, "location":"city5"}]

paired_ids = {key for key, count in Counter(item["Id"] for item in data).items() if count == 2} # O(n)

def combine_dict(data):
    result = {key: [] for key in paired_ids} # O(m), m: number of paired ids (m <= n/2)
    for item in data: # O(n)
        items = list(item.items())
        id, tl, val = items[0][1], items[1][0], items[1][1]

        if id in paired_ids: # O(1), as paired_ids is a set lookup takes O(1)
            result[id].append({tl: val})

    return [{"Id": id, "title": lst[0]["title"], "location": lst[1]["location"]} for id, lst in result.items()] # O(n)


print(*combine_dict(data), sep="\n")

Output:

{'Id': 1, 'title': 'example1', 'location': 'city1'}
{'Id': 2, 'title': 'example2', 'location': 'city2'}
{'Id': 4, 'title': 'example4', 'location': 'city4'}
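
The same filtering can be done without relying on key order inside each dictionary or on titles appearing before locations. This is a minimal sketch under those relaxed assumptions, not the answerer's method: combine_complete and the required parameter are hypothetical names, and every dictionary is assumed to contain an "Id" key.

```python
def combine_complete(data, required=("title", "location")):
    """Merge dictionaries by Id and keep only those with every required key."""
    merged = {}
    for item in data:  # single pass over the input, O(n)
        merged.setdefault(item["Id"], {}).update(item)
    # Drop merged entries that are missing a title or a location.
    return [m for m in merged.values() if all(k in m for k in required)]


data = [{"Id": 1, "title": "example1"},
        {"Id": 2, "title": "example2"},
        {"Id": 3, "title": "example3"},
        {"Id": 4, "title": "example4"},
        {"Id": 1, "location": "city1"},
        {"Id": 2, "location": "city2"},
        {"Id": 4, "location": "city4"},
        {"Id": 5, "location": "city5"}]

print(*combine_complete(data), sep="\n")
```

With this data, only Ids 1, 2 and 4 survive the filter, since Id 3 has no location and Id 5 has no title.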