Optimization: Remove difference between list of list-CodePudding

I need to know if i can optimize my code, because i feel like it can be.

Here the context:

list1 = [[1,'name1'], [2,'name2'], [3,'name3']]
list2 = [2,3]
for item in list1:
    if item[0] not in list2:
         list1.remove(item)

To get list1 i'm doing this:

list(filter(lambda x: name in x[1].lower(), list_of_items))

So i'm asking if it is really possible to optimize this code ?

Update:

As ask i can't use directly list2 in the lmbda filter because i'm getiing it with:

list2 = set(item[0] for item in list1) - set(object.id for object in list_in_my_bdd)

CodePudding user response：

That looks very much like a job for a dictionary lookup which probably is faster than a O(N^2) list comparison. So if you want to have all entries of list1 whose keys are contained in list2 and can assume that the keys in list_of_items are unique you can do:

list1 = [[1,'name1'], [2,'name2'], [3,'name3']]
dict1 = dict(list1)

list2 = [2,3]
result = [dict1[k] for k in list2]

This requires all keys contained in list 2 to actually be in list1 though. Otherwise there will be None values inside result

Timing:

To reproduce the timings in a notebook:

import numpy as np

list1 = [[i, f"name{i}"] for i in range(10000)]
list2 = np.random.choice(range(10000), 1000).tolist()

dict1 = dict(list1)

%timeit [dict1.get(k) for k in list2]
%timeit [item for item in list1 if item[0] in list2]

>>>10000 loops, best of 5: 141 µs per loop
>>>10 loops, best of 5: 160 ms per loop

So the dictionary lookup is approximately 1000 times faster than the list comprehension.

As @Paul mention this is only faster if the setup of list1 can be replaced by directly using a dictionary:

import numpy as np
list_of_items = [(j, f"name{i}") for j, i in enumerate(np.random.randint(0, 50000, 10000))]
list1 = list(filter(lambda x: "name" in x[1].lower(), list_of_items))
dict1 = dict(filter(lambda x: "name" in x[1].lower(), list_of_items))

CodePudding user response：

You could use a list comprehension instead and only include sub-lists of list1 whose first element is in list2:

[item for item in list1 if item[0] in list2]

CodePudding user response：

what about a list comprehension and set?

list1 = [[1,'name1'], [2,'name2'], [3,'name3']]
good = {2,3}

print([[a,b] for a,b in list1 if a in good])
# [[2, 'name2'], [3, 'name3']]

CodePudding user response：

You could do it like this. Note that the variable names are the same as in your question but list2 is actually a set (for efficiency).

list1 = [[1,'name1'], [2,'name2'], [3,'name3']]
list2 = {2, 3}
list3 = [item for item in list1 if item[0] in list2]
print(list3)

CodePudding user response：

To get list1 i'm doing this:

list(filter(lambda x: name in x[1].lower(), list_of_items))

Why don't you also include the check here:

list(filter(lambda x: name in x[1].lower() and x[0] in list2, list_of_items))