Filter list using duplicates in another list


I have two equal-length lists, a and b:

a = [1, 1, 2, 4, 5, 5, 5, 6, 1]
b = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

I would like to keep only those elements of b that correspond to the first occurrence of the matching element in a. Expected result:

result = ['a', 'c', 'd', 'e', 'h']

One way of reaching this result:

result = [each for index, each in enumerate(b) if a[index] not in a[:index]]
# result will be ['a', 'c', 'd', 'e', 'h']

Another way, invoking Pandas:

import pandas as pd
df = pd.DataFrame(dict(a=a, b=b))
result = list(df.b[~df.a.duplicated()])
# result will be ['a', 'c', 'd', 'e', 'h']
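
In the pandas version, duplicated() (with its default keep='first') marks every occurrence after the first as True, so ~ flips the mask to keep the first occurrences. A quick illustration, continuing from the snippet above:

print(df.a.duplicated().tolist())
# [False, True, False, False, False, True, True, False, True]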

Is there a more efficient way of doing this for large a and b?
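
For what it is worth, here is a rough harness for timing the two approaches on larger inputs (a sketch only; the sizes and the synthetic data are arbitrary):

import random
import timeit

import pandas as pd

n = 10_000
a = [random.randrange(100) for _ in range(n)]   # many repeated values
b = [f"item{i}" for i in range(n)]

def with_listcomp():
    return [each for index, each in enumerate(b) if a[index] not in a[:index]]

def with_pandas():
    df = pd.DataFrame(dict(a=a, b=b))
    return list(df.b[~df.a.duplicated()])

assert with_listcomp() == with_pandas()
print(timeit.timeit(with_listcomp, number=1))
print(timeit.timeit(with_pandas, number=1))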

CodePudding user response:

You could try whether this is faster:

firsts = {}
result = [firsts.setdefault(x, y) for x, y in zip(a, b) if x not in firsts]
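
A quick check with the example lists from the question: setdefault stores the first y paired with each x and returns it, and the `if x not in firsts` filter skips the later duplicates, so the whole thing is a single pass over the data.

a = [1, 1, 2, 4, 5, 5, 5, 6, 1]
b = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

firsts = {}
result = [firsts.setdefault(x, y) for x, y in zip(a, b) if x not in firsts]
print(result)  # ['a', 'c', 'd', 'e', 'h']
print(firsts)  # {1: 'a', 2: 'c', 4: 'd', 5: 'e', 6: 'h'}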