I have used during a problem this piece of code to retrieve the count of each element in a list :
nums = [1,1,3,2,3,3,4,1,1,4,2,3,3,2,1,3,4,1,2]
print([nums.count(num) for num in set(nums)])
This code works well but doesn't look like to be as efficient as this code :
from collections import Counter
nums = [1,1,3,2,3,3,4,1,1,4,2,3,3,2,1,3,4,1,2]
counter = Counter(nums)
print(counter.values())
Can someone explains to me why is the collections
library is more faster than the vanilla list comprehension ?
CodePudding user response:
The code with Counter
is roughly equivalent to:
def counter_v1(seq):
counts = {}
for x in seq:
counts[x] = counts.get(x, 0) 1
return counts
The code with .count
is roughly equivalent to:
def count(seq, x):
c = 0
for y in seq:
if y == x:
c = 1
return c
def counter_v2(seq):
counts = {}
for x in set(seq):
counts[x] = count(seq, x)
return counts
As you can see, function counter_v1
iterates through the sequence once, with a single for-loop. By contrast, function counter_v2
iterates through the sequence once per distinct element, with a for-loop inside another for-loop.
The runtime of counter_v1
will be roughly proportional to len(seq)
, whereas the runtime of counter_v2
will be roughly proportional to len(seq) * len(set(seq))
, which is usually much bigger.