Find details of only first label from each "label category"

Time:06-21

input = [{'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 833.0, 'xmax': 1652.0, 'ymax': 900.3014907836914, 'likelihood': 5}, {'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 900.30078125, 'xmax': 1652.0, 'ymax': 967.30078125, 'likelihood': 5}, {'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 967.421875, 'xmax': 1652.0, 'ymax': 1035.0, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1583.1669921875, 'xmax': 1651.0, 'ymax': 1617.0, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1617.0, 'xmax': 1651.0, 'ymax': 1649.1640625, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.1005859375, 'ymin': 1649.2998046875, 'xmax': 1651.0, 'ymax': 1685.0, 'likelihood': 5}, {'label': 'Accord_row_individuals', 'xmin': 48.0, 'ymin': 1801.0, 'xmax': 1652.0, 'ymax': 1867.0, 'likelihood': 5}]

Expected output = [{'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 833.0, 'xmax': 1652.0, 'ymax': 900.3014907836914, 'likelihood': 5},{'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1583.1669921875, 'xmax': 1651.0, 'ymax': 1617.0, 'likelihood': 5}, {'label': 'Accord_row_individuals', 'xmin': 48.0, 'ymin': 1801.0, 'xmax': 1652.0, 'ymax': 1867.0, 'likelihood': 5}]

I'm a bit stuck here, so any help is appreciated!

CodePudding user response:

Try this solution:

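def filter_by_first_label_occurence(labels):
    """Return the first entry for each distinct 'label' value, in input order."""
    labels_added = []   # label names seen so far
    filtered_list = []  # first entry found for each label
    for label in labels:
        # Keep the entry only if its label has not been seen before.
        if label["label"] not in labels_added:
            filtered_list.append(label)
            labels_added.append(label["label"])
    return filtered_list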

label_list = [{'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 833.0, 'xmax': 1652.0, 'ymax': 900.3014907836914, 'likelihood': 5}, {'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 900.30078125, 'xmax': 1652.0, 'ymax': 967.30078125, 'likelihood': 5}, {'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 967.421875, 'xmax': 1652.0, 'ymax': 1035.0, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1583.1669921875, 'xmax': 1651.0, 'ymax': 1617.0, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1617.0, 'xmax': 1651.0, 'ymax': 1649.1640625, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.1005859375, 'ymin': 1649.2998046875, 'xmax': 1651.0, 'ymax': 1685.0, 'likelihood': 5}, {'label': 'Accord_row_individuals', 'xmin': 48.0, 'ymin': 1801.0, 'xmax': 1652.0, 'ymax': 1867.0, 'likelihood': 5}]
filtered_labels = filter_by_first_label_occurence(label_list)

print(filtered_labels)
# >>> [{'label': 'Accord_row_loc', 'xmin': 48.0, 'ymin': 833.0, 'xmax': 1652.0, 'ymax': 900.3014907836914, 'likelihood': 5}, {'label': 'Accord_row_contact_info', 'xmin': 170.0, 'ymin': 1583.1669921875, 'xmax': 1651.0, 'ymax': 1617.0, 'likelihood': 5}, {'label': 'Accord_row_individuals', 'xmin': 48.0, 'ymin': 1801.0, 'xmax': 1652.0, 'ymax': 1867.0, 'likelihood': 5}]

Hopefully I understood your question properly.

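Explanation: the labels_added list remembers which labels have already been seen. For each entry, if its label is not yet in labels_added, the entry is appended to the output filtered_list and its label is recorded in labels_added. Later entries with the same label are skipped, so only the first entry per label is kept, in the original order. Hope this makes sense.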
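If the list gets long, the same idea can be sketched with a set instead of a list for the "already seen" bookkeeping, since membership checks on a set do not slow down as more labels are added. This is only a variation on the answer above, not a different algorithm; the names first_per_label, key, and sample are made up for illustration, and the sample data is shortened.

```python
def first_per_label(items, key="label"):
    """Return the first dict for each distinct value of `key`, in input order."""
    seen = set()   # values of `key` already encountered
    result = []    # first dict found for each value
    for item in items:
        value = item[key]
        if value not in seen:
            seen.add(value)
            result.append(item)
    return result


# Shortened, hypothetical sample in the same shape as the question's data.
sample = [
    {"label": "Accord_row_loc", "ymin": 833.0},
    {"label": "Accord_row_loc", "ymin": 900.30078125},
    {"label": "Accord_row_contact_info", "ymin": 1583.1669921875},
    {"label": "Accord_row_individuals", "ymin": 1801.0},
]

print(first_per_label(sample))
```

The output order follows the input order, so "first" here means first by position in the list, the same as in the list-based version.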
