Why should I sort before taking a random sample?

Time:12-24

Python just gave me weird advice:

>>> import random
>>> random.sample({1: 2, 3: 4, 5: 6}, 2)
Traceback (most recent call last):
  File "<pyshell#11>", line 1, in <module>
    random.sample({1: 2, 3: 4, 5: 6}, 2)
  File "C:\Users\*****\AppData\Local\Programs\Python\Python310\lib\random.py", line 466, in sample
    raise TypeError("Population must be a sequence.  For dicts or sets, use sorted(d).")
TypeError: Population must be a sequence.  For dicts or sets, use sorted(d).

Note the second part of the last error line. Why should I sort if I'm going to randomize a moment later anyway? Seems like wasting O(n log n) time.

CodePudding user response:

From the CPython commit history on GitHub (my emphasis):

In the future, the population must be a sequence. Instances of :class:`set` are no longer supported. The set must first be converted to a :class:`list` or :class:`tuple`, preferably in a deterministic order so that the sample is reproducible.

If you don't care about reproducibility, sorting is not necessary: any conversion to a sequence, such as list(d), is enough.
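
To make the trade-off concrete, here is a minimal sketch of both options. The helper names reproducible_sample and quick_sample are made up for illustration and are not part of the random module. The sketch relies on sorted() always producing the same list for the same elements, whereas the iteration order of a set (of strings, for example) can differ between interpreter runs because of hash randomization.

```python
import random

def reproducible_sample(population, k, seed):
    """Hypothetical helper: sample k items so that the same seed gives the same result."""
    rng = random.Random(seed)                 # private generator, global state untouched
    return rng.sample(sorted(population), k)  # sorted() fixes the order: O(n log n)

def quick_sample(population, k):
    """Hypothetical helper: sample k items when reproducibility does not matter."""
    return random.sample(list(population), k)  # list() is only O(n), no sorting

d = {1: 2, 3: 4, 5: 6}

# Same seed and same sorted sequence, so both calls return the same sample.
first = reproducible_sample(d, 2, seed=42)
second = reproducible_sample(d, 2, seed=42)
print(first == second)  # True

# No sorting: a valid sample of the keys, but not tied to any fixed order.
print(quick_sample(d, 2))
```

So the sort only costs something if you actually want the sample to be repeatable from a seed; otherwise list(d) or tuple(d) avoids the O(n log n) step.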
