How to find the indexes of the first unique values in a list of dictionaries using a key

Assume a list of dictionaries:

l=[
    {"id":1, "score":80, "remarks":"B" },
    {"id":2, "score":80, "remarks":"A" },
    {"id":1, "score":80, "remarks":"C" },
    {"id":3, "score":80, "remarks":"B" },
    {"id":1, "score":80, "remarks":"F" },
]

I would like to find the index of the first occurrence of each unique value for a given key. Given the list above, I expect these results:

using_id = [0,1,3]
using_score = [0]
using_remarks = [0,1,2,4]

What makes this hard for me is that the list contains dictionaries. If it contained numbers, I could just use this line:

indexes = [l.index(x) for x in sorted(set(l))]

Using set() on a list of dictionaries raises TypeError: unhashable type: 'dict'. The constraints are: only modules that ship with Python 3.10 may be used, and the code should scale, since the list will grow to hundreds of items. As few lines as possible is a bonus too :)

Of course there is the brute-force method, but can it be made more efficient, or written in fewer lines of code?

unique_items = []
unique_index = []


for index, item in enumerate(l):
    if item["remarks"] not in unique_items:
        unique_items.append(item["remarks"])
        unique_index.append(index)

print(unique_items)
print(unique_index)
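One possible tightening of the loop above, as a sketch rather than a definitive answer: track the values already seen in a set, so each membership test is constant time on average instead of a scan over a list. The function name first_unique_indexes is made up for this illustration, and the sketch assumes the values stored under the key are hashable (as the ints and strings here are).

```python
def first_unique_indexes(records, key):
    """Return the index of the first occurrence of each distinct value of `key`."""
    seen = set()   # values already encountered (assumes they are hashable)
    indexes = []
    for index, item in enumerate(records):
        value = item[key]
        if value not in seen:
            seen.add(value)
            indexes.append(index)
    return indexes


l = [
    {"id": 1, "score": 80, "remarks": "B"},
    {"id": 2, "score": 80, "remarks": "A"},
    {"id": 1, "score": 80, "remarks": "C"},
    {"id": 3, "score": 80, "remarks": "B"},
    {"id": 1, "score": 80, "remarks": "F"},
]

print(first_unique_indexes(l, "id"))       # [0, 1, 3]
print(first_unique_indexes(l, "score"))    # [0]
print(first_unique_indexes(l, "remarks"))  # [0, 1, 2, 4]
```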

CodePudding user response：

You could restructure your data into a dict of lists, and then use code similar to what you would use for a list of values:

dd = { k : [d[k] for d in l] for k in l[0] }
indexes = { k : sorted(dd[k].index(x) for x in set(dd[k])) for k in dd }

Output:

{'id': [0, 1, 3], 'score': [0], 'remarks': [0, 1, 2, 4]}
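To make the intermediate step easier to follow, here is a small sketch that prints the "dict of lists" built by the first line, using the list from the question. It assumes every dictionary has the same keys as l[0]. Since list.index returns the position of the first match, looking up each distinct value gives the first-occurrence indexes.

```python
l = [
    {"id": 1, "score": 80, "remarks": "B"},
    {"id": 2, "score": 80, "remarks": "A"},
    {"id": 1, "score": 80, "remarks": "C"},
    {"id": 3, "score": 80, "remarks": "B"},
    {"id": 1, "score": 80, "remarks": "F"},
]

# One list of values per key, in the original row order.
dd = {k: [d[k] for d in l] for k in l[0]}

print(dd["id"])       # [1, 2, 1, 3, 1]
print(dd["remarks"])  # ['B', 'A', 'C', 'B', 'F']

# list.index returns the position of the first match only.
print(dd["remarks"].index("B"))  # 0
```

Note that each .index call scans the list from the start, which is fine for a few hundred items but does repeated work when there are many distinct values.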

CodePudding user response：

Since dict keys are inherently unique, you can use a dict to track the first index of each unique value: store the values as keys and use dict.setdefault, which sets the index only when the key is not already present:

for key in 'id', 'score', 'remarks':
    unique = {}
    for i, d in enumerate(l):
        unique.setdefault(d[key], i)
    print(key, list(unique.values()))

This outputs:

id [0, 1, 3]
score [0]
remarks [0, 1, 2, 4]

Demo: https://replit.com/@blhsing/PalatableAnxiousMetric
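If you need the result for several keys at once, the same setdefault idea can be sketched as a single pass over the list. The function name first_indexes_by_key is hypothetical, and the sketch assumes every dictionary contains all of the requested keys. It relies on dicts preserving insertion order, so the indexes come out in ascending order.

```python
def first_indexes_by_key(records, keys):
    """Map each key to the indexes where each of its values first appears."""
    firsts = {key: {} for key in keys}   # key -> {value: first index}
    for i, record in enumerate(records):
        for key in keys:
            # setdefault stores i only if this value has not been seen yet.
            firsts[key].setdefault(record[key], i)
    return {key: list(seen.values()) for key, seen in firsts.items()}


l = [
    {"id": 1, "score": 80, "remarks": "B"},
    {"id": 2, "score": 80, "remarks": "A"},
    {"id": 1, "score": 80, "remarks": "C"},
    {"id": 3, "score": 80, "remarks": "B"},
    {"id": 1, "score": 80, "remarks": "F"},
]

print(first_indexes_by_key(l, ["id", "score", "remarks"]))
# {'id': [0, 1, 3], 'score': [0], 'remarks': [0, 1, 2, 4]}
```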

CodePudding user response：

With functools.reduce:

l=[
    {"id":1, "score":80, "remarks":"B" },
    {"id":2, "score":80, "remarks":"A" },
    {"id":1, "score":80, "remarks":"C" },
    {"id":3, "score":80, "remarks":"B" },
    {"id":1, "score":80, "remarks":"F" },
]

from functools import reduce

result = {}
reduce(lambda x, y: result.update({y[1]['remarks']:y[0]}) \
                     if y[1]['remarks'] not in result else None, \
        enumerate(l), result)

result
# {'B': 0, 'A': 1, 'C': 2, 'F': 4}

unique_items = list(result.keys())
unique_items
# ['B', 'A', 'C', 'F']
unique_index = list(result.values())
unique_index
# [0, 1, 2, 4]

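Explanation: at each step the lambda adds an entry to the dictionary result that maps the remarks value to its index in l, but only at the first occurrence of that value. The lambda's accumulator argument x is never used; reduce serves here only to drive the iteration, and result is filled in as a side effect.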

The dictionary structure for the result makes sense since you're extracting unique values and they can therefore be seen as keys.
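For comparison, the same value-to-first-index mapping can be sketched without reduce. This is an alternative technique, not the code from the answer above: it walks the list in reverse inside a dict comprehension, so the earliest index is written last and wins. The names first_index, unique_index and unique_items are chosen for this illustration.

```python
l = [
    {"id": 1, "score": 80, "remarks": "B"},
    {"id": 2, "score": 80, "remarks": "A"},
    {"id": 1, "score": 80, "remarks": "C"},
    {"id": 3, "score": 80, "remarks": "B"},
    {"id": 1, "score": 80, "remarks": "F"},
]

# Iterate from the end so that earlier indexes overwrite later ones.
first_index = {d["remarks"]: i for i, d in reversed(list(enumerate(l)))}

# Keys were inserted in reverse order, so sort by index to restore list order.
unique_index = sorted(first_index.values())
unique_items = sorted(first_index, key=first_index.get)

print(unique_index)  # [0, 1, 2, 4]
print(unique_items)  # ['B', 'A', 'C', 'F']
```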