I have a large 1D list arr1 of length 100000 which may contain duplicates, and another list arr2 which contains many of the elements in arr1 but cannot have duplicates. I wish to append all the elements of arr1 that are also in arr2 into a third list arr3:
file = []
with open('input.txt') as inputfile:
    for line in inputfile:
        file.append(line.strip().split(' '))
arr1 = file[1] # 2nd line of input file
arr2 = file[2] # 3rd line of input file
arr2 = set(arr2)
arr3 = [element for element in arr1 if element in arr2]
This works fine. But when I try:
arr3 = [element for element in arr1 if element in set(arr2)]
instead of the last two lines, I would expect exactly the same result, since the two versions appear to be equivalent, but it takes forever to run this way. Are these somehow different?
Here is the input file.
CodePudding user response:
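The if condition of a list comprehension is evaluated on every iteration, so set(arr2) is rebuilt from scratch for every element of arr1. Building the set costs time proportional to len(arr2), so the one-line version does roughly len(arr1) * len(arr2) work, while the two-line version builds the set once and then does a fast membership test per element.
The solution is to convert arr2 to a set before the comprehension, exactly as your first version does.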
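To see the difference concretely, here is a minimal sketch that counts how many times the set gets built in each version. The sample data and the counting_set helper are made up for illustration and stand in for the contents of input.txt; they are not from the question.

```python
# Hypothetical sample data standing in for lines 2 and 3 of input.txt.
arr1 = [str(i % 500) for i in range(2000)]   # may contain duplicates
arr2 = [str(i) for i in range(0, 500, 2)]    # no duplicates

builds = [0]  # mutable counter: how many times a set has been constructed


def counting_set(items):
    """Illustrative wrapper around set() that records each call."""
    builds[0] += 1
    return set(items)


# Slow form: the condition is re-evaluated for every element of arr1,
# so the set is rebuilt on each iteration.
slow = [element for element in arr1 if element in counting_set(arr2)]
slow_builds = builds[0]

# Fast form: build the set once, then reuse it for every membership test.
builds[0] = 0
lookup = counting_set(arr2)
fast = [element for element in arr1 if element in lookup]
fast_builds = builds[0]

print(slow == fast)               # both forms produce the same list
print(slow_builds, fast_builds)   # one build per element vs. a single build
```

Both comprehensions give the same arr3; only the number of set constructions differs, which is what makes the one-line version so slow on a list of 100000 elements.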