Group a list of dictionaries by key

Time:07-23

I currently have this list of dictionaries:

[{'N': 2, 'year': 2005},
 {'N': 3, 'year': 2005},
 {'R': 4, 'year': 2005},
 {'R': 5, 'year': 2005},
 {'N': 2, 'year': 2006},
 {'N': 3, 'year': 2006},
 {'R': 4, 'year': 2006},
 {'R': 5, 'year': 2006}]

I am trying to get to the format below, with one dictionary per unique year and the values of the other keys collected into lists:

[{'N': [2, 3], 'R': [4, 5], 'year': 2005},
 {'N': [2, 3], 'R': [4, 5], 'year': 2006}]

CodePudding user response:

You can use itertools.groupby to group the entries by year and build the result from there. The input is sorted by year first, because groupby only groups consecutive items.

import itertools
from operator import itemgetter

d = [{'N': 2, 'year': 2005},
     {'N': 3, 'year': 2005},
     {'R': 4, 'year': 2005},
     {'R': 5, 'year': 2005},
     {'N': 2, 'year': 2006},
     {'N': 3, 'year': 2006},
     {'R': 4, 'year': 2006},
     {'R': 5, 'year': 2006}]

groups = itertools.groupby(sorted(d, key=itemgetter('year')), key=itemgetter('year'))

result = []

for year, group in groups:
    entry = {'year': year, 'N': [], 'R': []}
    result.append(entry)
    for input_entry in group:
        if 'N' in input_entry:
            entry['N'].append(input_entry['N'])
        if 'R' in input_entry:
            entry['R'].append(input_entry['R'])

assert result == [{'N': [2, 3], 'R': [4, 5], 'year': 2005},
                  {'N': [2, 3], 'R': [4, 5], 'year': 2006}]
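
The groupby call above sorts the list first. As a small sketch of why that matters, the snippet below runs groupby on a deliberately shuffled three-entry list (the names rows, unsorted_years and sorted_years are made up for this illustration) and compares the group keys with and without the sort.

```python
from itertools import groupby
from operator import itemgetter

# Illustrative input: the two 2005 entries are not adjacent.
rows = [{'N': 2, 'year': 2005},
        {'N': 2, 'year': 2006},
        {'R': 4, 'year': 2005}]

# Without sorting, groupby starts a new group every time the key changes,
# so 2005 shows up twice.
unsorted_years = [year for year, _ in groupby(rows, key=itemgetter('year'))]

# After sorting by the same key, each year forms exactly one group.
by_year = sorted(rows, key=itemgetter('year'))
sorted_years = [year for year, _ in groupby(by_year, key=itemgetter('year'))]

print(unsorted_years)  # [2005, 2006, 2005]
print(sorted_years)    # [2005, 2006]
```

If the input is already ordered by year, as in the question, the sort does no harm; it only guards against input that is not.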

You can also roll this yourself without itertools. This version makes a single pass over the list and skips the sort that groupby needs, so it should not be slower; measure on your own data if performance matters. Change your data model slightly to a dict of dicts keyed by year and you can do this:

# d is the same list of dictionaries as above
intermediate = dict()

for entry in d:
    year, n, r = entry['year'], entry.get('N'), entry.get('R')
    if year not in intermediate:
        intermediate[year] = {'N': [], 'R': []}
    # compare against None so that falsy values such as 0 are kept
    if n is not None:
        intermediate[year]['N'].append(n)
    if r is not None:
        intermediate[year]['R'].append(r)

# then mutate back to the shape you need

result = [{'year': key, **value} for key, value in intermediate.items()]
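
The dict-of-dicts idea can also be written without hard-coding 'N' and 'R'. The following is a minimal sketch, assuming every entry contains the grouping key; group_by_key is a hypothetical helper name, not something from the standard library, and collections.defaultdict is used to create the per-year lists on demand.

```python
from collections import defaultdict

def group_by_key(rows, key='year'):
    """Group rows by rows[i][key], collecting every other value into a list."""
    # Outer dict: one bucket per key value; inner dict: one list per field name.
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        bucket = grouped[row[key]]  # creates the group even if the row has no other fields
        for name, value in row.items():
            if name != key:
                bucket[name].append(value)
    # Flatten back to a list of plain dicts, one per key value.
    return [{key: k, **values} for k, values in grouped.items()]

d = [{'N': 2, 'year': 2005},
     {'N': 3, 'year': 2005},
     {'R': 4, 'year': 2005},
     {'R': 5, 'year': 2005},
     {'N': 2, 'year': 2006},
     {'N': 3, 'year': 2006},
     {'R': 4, 'year': 2006},
     {'R': 5, 'year': 2006}]

result = group_by_key(d)
print(result)
# [{'year': 2005, 'N': [2, 3], 'R': [4, 5]}, {'year': 2006, 'N': [2, 3], 'R': [4, 5]}]
```

Groups come out in the order each year is first seen, since Python dicts preserve insertion order.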

CodePudding user response:

The solution below builds the result in two forms: resultDict, if you prefer a dictionary keyed by year, and resultArr, the list form asked for in the question. It works for any keys, not just N and R.

# data is the list of dictionaries from the question
resultDict = {}

for item in data:

    # read the year without popping it, so the input list is left intact
    itemYear = item["year"]

    if itemYear not in resultDict:
        resultDict[itemYear] = {"year": itemYear}

    # iterate through every other key in the dictionary
    for key in item:
        if key == "year":
            continue

        if key not in resultDict[itemYear]:
            resultDict[itemYear][key] = []

        resultDict[itemYear][key].append(item[key])

# the list form is just the values of the dictionary form
resultArr = list(resultDict.values())

print(resultDict)
print(resultArr)