Remove duplicate keys and values in a list of dictionaries and append a unique list of values to a key

Time:09-25

list_of_dict=[{'f_text':'sample', 'symbol':'*', 'f_id':246, 'record_id':'4679', 'flag': 'N'},
{'f_text':'sample', 'symbol':'*', 'f_id':246, 'record_id':'4680', 'flag': 'N'},
{'f_text':'other text', 'symbol':'!#', 'f_id':247, 'record_id':'4678', 'flag': 'N'}]

In the list of dictionaries above, the first and second entries have the same 'f_id': 246, so I'm trying to merge these duplicates into a single dictionary with 'record_id': ['4679', '4680'].

I'm expecting the output below:

[{'f_text':'sample', 'symbol':'*', 'f_id':246, 'record_id':['4679','4680'], 'flag': 'N'},
{'f_text':'other text', 'symbol':'!#', 'f_id':247, 'record_id':'4678', 'flag': 'N'}]

I've tried the code below:

for dictionary in list_of_dict:
    if dictionary['record_id'] == int(record_id):
        dictionary['flag'] = 'CHECKED'
    sort_footnote = sorted(get_footnote, key=lambda k: k['flag'])

Please suggest a better solution for my problem.

CodePudding user response:

@U12-Forward's answer works only if the input is pre-sorted, with records of the same f_ids already grouped together.
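
To illustrate why ordering matters: itertools.groupby only groups consecutive items that share a key, so the same f_id can show up more than once when the input is not sorted. A small sketch, using a made-up rows list rather than the data from the question:

```python
from itertools import groupby

# Hypothetical sample: the two f_id 246 records are not adjacent.
rows = [
    {'f_id': 246, 'record_id': '4679'},
    {'f_id': 247, 'record_id': '4678'},
    {'f_id': 246, 'record_id': '4680'},
]

def key(row):
    return row['f_id']

# Without sorting, groupby starts a new group every time the key changes.
unsorted_keys = [k for k, _ in groupby(rows, key=key)]
print(unsorted_keys)  # [246, 247, 246]

# Sorting first brings equal keys together, so each f_id appears once.
sorted_keys = [k for k, _ in groupby(sorted(rows, key=key), key=key)]
print(sorted_keys)  # [246, 247]
```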

A more robust approach that works regardless of the order of the input is to build a dict that maps each f_id to its dict, converting the record_id value to a list when several records share the same f_id:

mapping = {}
for d in list_of_dict:
    try:
        entry = mapping[d['f_id']] # raises KeyError
        entry['record_id'].append(d['record_id']) # raises AttributeError
    except KeyError:
        mapping[d['f_id']] = d
    except AttributeError:
        entry['record_id'] = [entry['record_id'], d['record_id']]
print(list(mapping.values()))

This outputs:

[{'f_text': 'sample', 'symbol': '*', 'f_id': 246, 'record_id': ['4679', '4680'], 'flag': 'N'}, {'f_text': 'other text', 'symbol': '!#', 'f_id': 247, 'record_id': '4678', 'flag': 'N'}]
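
Note that the loop above stores the original dicts in mapping and updates them in place, so the first entry of list_of_dict itself ends up with a list as its record_id. If that side effect is unwanted, one possible variant copies each dict the first time its f_id is seen. This is only a sketch; the function name merge_by_f_id is made up for illustration, and it assumes record_id values in the input are never lists themselves:

```python
def merge_by_f_id(records):
    """Merge dicts sharing an f_id; record_id becomes a list only for duplicates."""
    mapping = {}
    for d in records:
        f_id = d['f_id']
        if f_id not in mapping:
            # Shallow copy so the caller's dicts are left untouched.
            mapping[f_id] = dict(d)
            continue
        entry = mapping[f_id]
        if isinstance(entry['record_id'], list):
            entry['record_id'].append(d['record_id'])
        else:
            # Second record for this f_id: turn the single value into a list.
            entry['record_id'] = [entry['record_id'], d['record_id']]
    return list(mapping.values())

list_of_dict = [
    {'f_text': 'sample', 'symbol': '*', 'f_id': 246, 'record_id': '4679', 'flag': 'N'},
    {'f_text': 'sample', 'symbol': '*', 'f_id': 246, 'record_id': '4680', 'flag': 'N'},
    {'f_text': 'other text', 'symbol': '!#', 'f_id': 247, 'record_id': '4678', 'flag': 'N'},
]
merged = merge_by_f_id(list_of_dict)
print(merged)
```

The result has the same shape as the output above, while list_of_dict keeps its original string record_id values.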

CodePudding user response:

Try itertools.groupby: sort the records by f_id, group them, and merge the record_id values within each group:

from itertools import groupby
[{**l[0], 'record_id': [x['record_id'] for x in l]} if len(l:=list(v)) > 1 else l[0] for _, v in groupby(sorted(list_of_dict, key=lambda x: x['f_id']), key=lambda x: x['f_id'])]

Output:

[{'f_text': 'sample', 'symbol': '*', 'f_id': 246, 'record_id': ['4679', '4680'], 'flag': 'N'}, {'f_text': 'other text', 'symbol': '!#', 'f_id': 247, 'record_id': '4678', 'flag': 'N'}]

Or define the lambda once:

from itertools import groupby
func = lambda x: x['f_id']
[{**l[0], 'record_id': [x['record_id'] for x in l]} if len(l:=list(v)) > 1 else l[0] for _, v in groupby(sorted(list_of_dict, key=func), key=func)]
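
The same idea can also be written as a regular function, which may be easier to read than the one-liner and does not rely on the walrus operator (Python 3.8+). A sketch only; the name group_records is hypothetical, and operator.itemgetter stands in for the lambda:

```python
from itertools import groupby
from operator import itemgetter

def group_records(records):
    """Group records by f_id; merge record_id into a list for duplicated f_ids."""
    get_f_id = itemgetter('f_id')
    result = []
    # groupby only merges adjacent items, so sort by the same key first.
    for _, group in groupby(sorted(records, key=get_f_id), key=get_f_id):
        items = list(group)
        if len(items) > 1:
            # Keep the first dict's fields, replace record_id with all values.
            result.append({**items[0], 'record_id': [x['record_id'] for x in items]})
        else:
            result.append(items[0])
    return result

list_of_dict = [
    {'f_text': 'sample', 'symbol': '*', 'f_id': 246, 'record_id': '4679', 'flag': 'N'},
    {'f_text': 'sample', 'symbol': '*', 'f_id': 246, 'record_id': '4680', 'flag': 'N'},
    {'f_text': 'other text', 'symbol': '!#', 'f_id': 247, 'record_id': '4678', 'flag': 'N'},
]
grouped = group_records(list_of_dict)
print(grouped)
```

Unlike the dict-based approach, the result here comes back ordered by f_id rather than by first appearance in the input.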