input = [{'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 833.0, 'xmax': 1652.0, 'ymax': 900.3014907836914, 'likelihood': 5}, {'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 900.30078125, 'xmax': 1652.0, 'ymax': 967.30078125, 'likelihood': 5}, {'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 967.421875, 'xmax': 1652.0, 'ymax': 1035.0, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1583.1669921875, 'xmax': 1651.0, 'ymax': 1617.0, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1617.0, 'xmax': 1651.0, 'ymax': 1649.1640625, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.1005859375, 'ymin': 1649.2998046875, 'xmax': 1651.0, 'ymax': 1685.0, 'likelihood': 5}, {'label': 'Accord_row_individuals', 'xmin': 48.0, 'ymin': 1801.0, 'xmax': 1652.0, 'ymax': 1867.0, 'likelihood': 5}]
Expected output = [{'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 833.0, 'xmax': 1652.0, 'ymax': 900.3014907836914, 'likelihood': 5},{'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1583.1669921875, 'xmax': 1651.0, 'ymax': 1617.0, 'likelihood': 5}, {'label': 'Accord_row_individuals', 'xmin': 48.0, 'ymin': 1801.0, 'xmax': 1652.0, 'ymax': 1867.0, 'likelihood': 5}]
I got a little stuck here, so any help is appreciated!
CodePudding user response:
Try this solution:
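def filter_by_first_label_occurence(labels):
    labels_added = []   # label names that have already been kept
    filtered_list = []  # first dict seen for each label name
    for label in labels:
        # Keep the entry only if its label name has not been seen yet
        if label["label"] not in labels_added:
            filtered_list.append(label)
            labels_added.append(label["label"])
    return filtered_list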
label_list = [{'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 833.0, 'xmax': 1652.0, 'ymax': 900.3014907836914, 'likelihood': 5}, {'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 900.30078125, 'xmax': 1652.0, 'ymax': 967.30078125, 'likelihood': 5}, {'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 967.421875, 'xmax': 1652.0, 'ymax': 1035.0, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1583.1669921875, 'xmax': 1651.0, 'ymax': 1617.0, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1617.0, 'xmax': 1651.0, 'ymax': 1649.1640625, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.1005859375, 'ymin': 1649.2998046875, 'xmax': 1651.0, 'ymax': 1685.0, 'likelihood': 5}, {'label': 'Accord_row_individuals', 'xmin': 48.0, 'ymin': 1801.0, 'xmax': 1652.0, 'ymax': 1867.0, 'likelihood': 5}]
filtered_labels = filter_by_first_label_occurence(label_list)
print(filtered_labels)
# >>> [{'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 833.0, 'xmax': 1652.0, 'ymax': 900.3014907836914, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1583.1669921875, 'xmax': 1651.0, 'ymax': 1617.0, 'likelihood': 5}, {'label': 'Accord_row_individuals', 'xmin': 48.0, 'ymin': 1801.0, 'xmax': 1652.0, 'ymax': 1867.0, 'likelihood': 5}]
Hopefully I understood your question properly.
Explanation: The function remembers which labels it has already kept in the labels_added list. If the label of the entry being checked is not in that list yet, the entry is appended to the output filtered_list and its label is appended to labels_added. Later entries with the same label are skipped. Hope this makes sense.
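If the list gets long, the same idea can be sketched with a set instead of a list for the "already seen" bookkeeping, since membership checks on a set do not slow down as it grows. This is only a variant of the approach above, not a different algorithm; the function name first_per_label, its key parameter, and the shortened sample rows are made up for illustration.

```python
def first_per_label(rows, key="label"):
    """Return the first dict seen for each distinct value of `key`, in input order."""
    seen = set()   # values of `key` that have already been kept
    result = []    # first row for each distinct value
    for row in rows:
        value = row[key]
        if value not in seen:
            seen.add(value)
            result.append(row)
    return result


# Shortened, illustrative version of the data from the question
rows = [
    {"label": "Accord_row_loc", "ymin": 833.0},
    {"label": "Accord_row_loc", "ymin": 900.30078125},
    {"label": "Accord_row_contact_info", "ymin": 1583.1669921875},
    {"label": "Accord_row_contact_info", "ymin": 1617.0},
    {"label": "Accord_row_individuals", "ymin": 1801.0},
]

print([row["ymin"] for row in first_per_label(rows)])
# >>> [833.0, 1583.1669921875, 1801.0]
```

For a handful of rows like in the question the list-based version is just as good; the set only starts to matter when there are many distinct labels.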