I have a JSON file which contains duplicate records, and I need to send only the unique records to another application. To identify the unique records, I first need to merge the JRNAL_NO & JRNAL_LINE fields separated by a hyphen (e.g. 655-1) and then use this as the key. Is it possible to do this via a Python script?
Thank you, John
CodePudding user response:
I managed to get the expected results using the below script (thanks to some old posts in SO). I am new to Python, so if there are better ways of doing this, kindly suggest. Thanks.
import json

source = json.loads(input_var)
target = []
seen = set()
for record in source:
    # Build the composite key, e.g. "655-1"
    name = record['JRNAL_NO'] + '-' + record['JRNAL_LINE']
    if name not in seen:
        seen.add(name)
        target.append(record)
output_var = json.dumps(target)
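The same de-duplication can also be written with a dict keyed on the composite "JRNAL_NO-JRNAL_LINE" value, which keeps the first record seen per key. A minimal sketch; the sample records are made up, and `input_var` here stands in for your JSON string:

```python
import json

# Sample input; in practice input_var would hold your JSON string.
input_var = json.dumps([
    {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "1", "D_C": "C"},
    {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"},
    {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"},
])

# setdefault only stores a record if its key is not present yet, so the
# first occurrence of each key wins; dicts keep insertion order (3.7+).
unique = {}
for record in json.loads(input_var):
    key = f"{record['JRNAL_NO']}-{record['JRNAL_LINE']}"
    unique.setdefault(key, record)

output_var = json.dumps(list(unique.values()))
print(output_var)
```

This avoids tracking a separate `seen` set, since the dict itself records which keys have appeared.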
CodePudding user response:
To solve this problem, we can gather the dictionaries in a list and then filter out the ones we have already collected, preserving their order:
dictA = {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "1", "D_C": "C"}
dictB = {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"}
dictC = {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"}
dictD = {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"}
list_of_dicts = [dictA, dictB, dictC, dictD]
result = []
for x in list_of_dicts:
    if x not in result:
        result.append(x)
print(result)
This will return a list of the non-repeated dictionaries:
[{'PERIOD': '2022007', 'JRNAL_NO': '655', 'JRNAL_LINE': '1', 'D_C': 'C'}, {'PERIOD': '2022007', 'JRNAL_NO': '655', 'JRNAL_LINE': '3', 'D_C': 'C'}]
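One caveat: the `x not in result` membership test scans the whole result list for every record, so this approach is quadratic in the number of records. For large inputs, a set of hashable snapshots of each dict keeps the work linear. A sketch, reusing made-up sample records like the ones above:

```python
dictA = {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "1", "D_C": "C"}
dictB = {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"}
dictC = {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"}

list_of_dicts = [dictA, dictB, dictC]

seen = set()
result = []
for x in list_of_dicts:
    # Dicts are not hashable, but a frozenset of their items is,
    # so it can serve as a set member for the duplicate check.
    marker = frozenset(x.items())
    if marker not in seen:
        seen.add(marker)
        result.append(x)

print(result)
```

This compares records by their full contents; to de-duplicate only on JRNAL_NO/JRNAL_LINE, build the marker from just those two fields instead.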