Let's say that we have this array:
people = [['Amy', 25], ['Bella', 30], ['Charlie', 29], ['Dean', 21], ['Elliot', 19]]
And I have a list of names that I want to remove from it:
people_rem = ['Amy', 'Charlie', 'Dean']
So that our final array will look like this:
final_people = [['Bella', 30], ['Elliot', 19]]
I have tried doing this with a list comprehension, which works, but it's incredibly slow (not in this specific case, but in my real-life usage I have a lot of lists with a lot more items):
final_people = [person for person in people if person[0] not in people_rem]
How would I do this in a way that's efficient and fast?
CodePudding user response:
You are using a data structure that supports only linear lookup. You can use the bisect module to do logarithmic-time lookup (deletion will still be linear time), but why bother when there is a structure that lets you do constant-time lookup and deletion?
Use a dictionary:
people = dict(people)
Now removal is trivial:
for name in people_rem:
    del people[name]
Notice that this runs in O(len(people_rem)) time, not O(len(people)). Since presumably len(people_rem) < len(people), this is a good thing (TM). I'm not counting the O(len(people)) conversion to a dictionary, since you can likely do that directly when you create people in the first place, making it no more expensive than building the initial list.
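For reference, here is a minimal sketch of the dictionary approach end to end, using the sample data from the question. The function names remove_with_dict and remove_with_set are made up for illustration. The first uses pop(name, None) instead of del so that a name missing from people does not raise KeyError, which is an assumption about what you want. The second is a different technique from the one above: it keeps the list of lists and only turns people_rem into a set, so each membership test in the comprehension is constant time on average.

```python
people = [['Amy', 25], ['Bella', 30], ['Charlie', 29], ['Dean', 21], ['Elliot', 19]]
people_rem = ['Amy', 'Charlie', 'Dean']

def remove_with_dict(people, people_rem):
    """Dictionary approach: one constant-time deletion per name to remove."""
    by_name = dict(people)  # maps name -> age; duplicate names would collapse
    for name in people_rem:
        # pop with a default ignores names that are not present
        by_name.pop(name, None)
    # Convert back to a list of [name, age] pairs (insertion order is kept)
    return [[name, age] for name, age in by_name.items()]

def remove_with_set(people, people_rem):
    """Alternative: keep the list, but test membership against a set."""
    to_remove = set(people_rem)  # average O(1) lookup instead of O(len(people_rem))
    return [person for person in people if person[0] not in to_remove]

print(remove_with_dict(people, people_rem))
print(remove_with_set(people, people_rem))
```

If names can repeat in people, the set version is probably the safer choice, since a dictionary keeps only one entry per name.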
CodePudding user response:
Have you tried doing it through pandas? Check if this is faster.
import pandas as pd
people = [['Amy', 25], ['Bella', 30], ['Charlie', 29], ['Dean', 21], ['Elliot', 19]]
people_rem = ['Amy', 'Charlie', 'Dean']
def remove(people, people_rem):
    df = pd.DataFrame(people, columns = ['Name', 'Age'])
    for person in people_rem:
        df.drop(df[df.Name == person].index, inplace=True)
    return df.values.tolist()
final_people = remove(people, people_rem)
print(final_people)
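If you do go the pandas route, a possible variant is to filter once with a boolean mask instead of dropping rows in a loop. This is only a sketch: remove_isin is a hypothetical name, and whether it beats the plain Python versions depends on your data size, so it is worth timing on your real lists.

```python
import pandas as pd

people = [['Amy', 25], ['Bella', 30], ['Charlie', 29], ['Dean', 21], ['Elliot', 19]]
people_rem = ['Amy', 'Charlie', 'Dean']

def remove_isin(people, people_rem):
    df = pd.DataFrame(people, columns=['Name', 'Age'])
    # isin builds a boolean mask in one pass; ~ inverts it to keep the rest
    kept = df[~df['Name'].isin(people_rem)]
    return kept.values.tolist()

final_people = remove_isin(people, people_rem)
print(final_people)
```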