An efficient way to aggregate a list of dictionaries

Time:01-13

I have a list of Python dictionaries and I'm trying to aggregate the values of each key using a different metric per key (max or min).

Right now, I am converting the list of dicts to a pandas DataFrame and then using the agg method to return my desired output.

But doing so adds time and memory overhead. I would appreciate some help making this faster without resorting to pandas.

What I've done so far:

import pandas as pd

boxes = [{'width': 178.25, 'right': 273.25, 'top': 535.0, 'left': 95.0, 'bottom': 549.0, 'height': 14.0}, {'width': 11.17578125, 'right': 87.17578125, 'top': 521.0, 'left': 76.0, 'bottom': 535.0, 'height': 14.0}, {'width': 230.8515625, 'right': 306.8515625, 'top': 492.0, 'left': 76.0, 'bottom': 506.0, 'height': 14.0}, {'width': 14.65234375, 'right': 90.65234375, 'top': 535.0, 'left': 76.0, 'bottom': 549.0, 'height': 14.0}, {'width': 7.703125, 'right': 83.703125, 'top': 506.0, 'left': 76.0, 'bottom': 520.0, 'height': 14.0}, {'width': 181.8515625, 'right': 276.8515625, 'top': 521.0, 'left': 95.0, 'bottom': 535.0, 'height': 14.0}, {'width': 211.25, 'right': 306.25, 'top': 506.0, 'left': 95.0, 'bottom': 520.0, 'height': 14.0}]
df = pd.DataFrame(boxes)
# Reduce each coordinate column with its own metric
agg = df.agg({'left': 'min', 'right': 'max', 'top': 'min', 'bottom': 'max'})
# Derive the size of the enclosing box from its edges
agg['height'] = agg['bottom'] - agg['top']
agg['width'] = agg['right'] - agg['left']
res = agg.to_dict()

Desired Result

{'left': 76.0, 'right': 306.8515625, 'top': 492.0, 'bottom': 549.0, 'height': 57.0, 'width': 230.8515625}

CodePudding user response:

Here's one approach:

(i) Use dict.setdefault to merge the dictionaries into a single dictionary, temp, that maps each key to a list of its values.

(ii) Traverse temp and apply the matching function from the functions dictionary to each key's list of values.

(iii) 'height' and 'width' are not in functions, so calculate them separately from the aggregated edges.

# Metric to apply to each key
functions = {'left': min, 'right': max, 'top': min, 'bottom': max}
temp = {}
for d in boxes:
    for k, v in d.items():
        if k in functions:
            # Collect every value seen for this key
            temp.setdefault(k, []).append(v)

# Reduce each list of values with its metric
out = {k: functions[k](v) for k, v in temp.items()}
out['height'] = out['bottom'] - out['top']
out['width'] = out['right'] - out['left']

Output:

{'width': 230.8515625,
 'right': 306.8515625,
 'top': 492.0,
 'left': 76.0,
 'bottom': 549.0,
 'height': 57.0}
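
Since the concern is memory as well as run time, a possible variation is to skip the intermediate lists and feed generator expressions straight into min and max. This is only a sketch, not a benchmarked improvement: the name aggregate_boxes is made up for illustration, it assumes every dictionary has all four edge keys, and it raises ValueError on an empty list.

```python
def aggregate_boxes(boxes):
    """Return the box enclosing every box in `boxes` (hypothetical helper)."""
    # Generator expressions hand min/max one value at a time,
    # so no per-key list of values is built.
    out = {
        'left': min(d['left'] for d in boxes),
        'right': max(d['right'] for d in boxes),
        'top': min(d['top'] for d in boxes),
        'bottom': max(d['bottom'] for d in boxes),
    }
    # Derive the size from the aggregated edges
    out['height'] = out['bottom'] - out['top']
    out['width'] = out['right'] - out['left']
    return out


# Two of the boxes from the question, enough to show the call
sample = [
    {'width': 178.25, 'right': 273.25, 'top': 535.0, 'left': 95.0, 'bottom': 549.0, 'height': 14.0},
    {'width': 230.8515625, 'right': 306.8515625, 'top': 492.0, 'left': 76.0, 'bottom': 506.0, 'height': 14.0},
]
print(aggregate_boxes(sample))
```

This walks the list four times instead of once, so whether it is actually faster than the setdefault version depends on the data; timing both with timeit on a realistic list would settle it.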