I have two equal-length lists, a
and b
:
a = [1, 1, 2, 4, 5, 5, 5, 6, 1]
b = ['a','b','c','d','e','f','g','h', 'i']
I would like to keep only those elements from b
, which correspond to an element in a
appearing for the first time. Expected result:
result = ['a', 'c', 'd', 'e', 'h']
One way of reaching this result:
result = [each for index, each in enumerate(b) if a[index] not in a[:index]]
# result will be ['a', 'c', 'd', 'e', 'h']
Another way, invoking Pandas:
import pandas as pd
df = pd.DataFrame(dict(a=a,b=b))
result = list(df.b[~df.a.duplicated()])
# result will be ['a', 'c', 'd', 'e', 'h']
Is there a more efficient way of doing this for large a
and b
?
CodePudding user response:
You could try if this is faster:
firsts = {}
result = [firsts.setdefault(x, y) for x, y in zip(a, b) if x not in firsts]