Motivation:
A list such as [1,2,3,4,5,6,7,8,9,10] is expressible as range(1,11), and an in check against a range runs in near-constant time. Big-ish lists of numbers can therefore be simplified by expressing them as a list of range() objects, which uses fewer bytes than storing every number in a list.
# what I have:
numbers = [1, 2, 5, 7, 8, 9, 10, 13, 14, 15, 42, 43, 44, 45, 46, 47, 48, 49, 50]
# what I want:
ranges = [range(1,3), range(5,6), range(7,11), range(13,16), range(42,51)]
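The two claims in the motivation can be checked with a small sketch. It assumes CPython, where sys.getsizeof reports only the shallow size of a container, so the helper deep_size (a hypothetical name, not from the original post) adds the sizes of the elements as well; contains is likewise an illustrative helper.

```python
import sys

numbers = [1, 2, 5, 7, 8, 9, 10, 13, 14, 15, 42, 43, 44, 45, 46, 47, 48, 49, 50]
ranges = [range(1, 3), range(5, 6), range(7, 11), range(13, 16), range(42, 51)]


def contains(ranges, n):
    """Return True if n lies in any of the ranges.

    Each `n in r` test is O(1) for an int and a range, so the total
    cost grows with the number of ranges, not the count of numbers.
    """
    return any(n in r for r in ranges)


def deep_size(container):
    """Shallow size of the container plus the sizes of its elements
    (hypothetical helper, a rough estimate only)."""
    return sys.getsizeof(container) + sum(sys.getsizeof(x) for x in container)


# Both representations agree on membership.
assert all(contains(ranges, n) == (n in numbers) for n in range(0, 60))

# Exact byte counts depend on the Python build, so none are shown here.
print(deep_size(numbers), deep_size(ranges))
```

With only a handful of short runs the saving is modest; it grows with the length of the consecutive runs, since a range has the same size no matter how many numbers it covers.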
(Self answered) Question:
How can I move from a list of numbers to a list of ranges of consecutive numbers, to facilitate smaller storage and faster in checks?
Inspired by
- Find index of longest continious sequence in list python
- How to transform a list of numbers into a list of ranges of consecutive numbers (which would take the ranges and string-format them)
- my answer to Concatenate and simplify a list containing number and letter pairs, where I could have used this to merge the consecutive ord(letter) values of "abcd" into "a-d"
CodePudding user response:
There should be ways to leverage itertools.takewhile or itertools.groupby (using boolean conditional grouping) - the simplest way without any imports I came up with is:
Generator:
def get_consecutive_ranges(numbers):
    """Takes an iterable of sorted integers, yields the
    ranges of consecutive numbers that form it."""
    start, stop = None, None
    for num in numbers:
        if stop is None:
            # first number: open the first run
            start, stop = num, num
        elif stop == num - 1:
            # num continues the current run
            stop += 1
        else:
            # gap found: emit the finished run, open a new one
            yield range(start, stop + 1)
            start, stop = num, num
    # emit the last run; an empty input yields nothing
    if start is not None:
        yield range(start, stop + 1)
Test:
data = [1, 2, 5, 7, 8, 9, 10, 13, 14, 15, 42, 43, 44, 45, 46, 47, 48, 49, 50]
list_of_ranges = list(get_consecutive_ranges(data))
print(list_of_ranges)
to get
[range(1, 3), range(5, 6), range(7, 11), range(13, 16), range(42, 51)]
Application for letter ranges:
letters = "abcghijklmpqrs"
lgroup = [f"{chr(min(r))}-{chr(max(r))}"
          for r in get_consecutive_ranges(map(ord, letters))]
print(lgroup)
to get
['a-c', 'g-m', 'p-s'] # from "abcghijklmpqrs"
CodePudding user response:
Another version, using itertools.groupby:
from itertools import groupby
lst = [1, 2, 5, 7, 8, 9, 10, 13, 14, 15, 42, 43, 44, 45, 46, 47, 48, 49, 50]
out = []
# index minus value is constant within a run of consecutive numbers
for _, g in groupby(enumerate(lst), lambda k: k[0] - k[1]):
    start = next(g)[1]
    end = list(v for _, v in g) or [start]
    out.append(range(start, end[-1] + 1))
print(out)
Prints:
[range(1, 3), range(5, 6), range(7, 11), range(13, 16), range(42, 51)]
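As a sanity check on either approach, the ranges can be flattened back into the original numbers. The sketch below restates the groupby grouping in a function named to_ranges (an illustrative name, not from the answers above). It assumes the input should first be sorted and de-duplicated, since both answers rely on ascending, unique values.

```python
from itertools import chain, groupby


def to_ranges(numbers):
    """Group integers into ranges of consecutive values.

    Assumption: unsorted or duplicate input is acceptable, so it is
    normalized with sorted(set(...)) first.
    """
    ordered = sorted(set(numbers))
    result = []
    # Within a consecutive run, value minus index is constant.
    for _, group in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        run = [value for _, value in group]
        result.append(range(run[0], run[-1] + 1))
    return result


data = [10, 1, 2, 2, 5, 9, 8, 7]
ranges = to_ranges(data)
print(ranges)

# Flattening the ranges gives back the sorted, unique input.
assert list(chain.from_iterable(ranges)) == sorted(set(data))
```

Skipping the normalization step is reasonable when the input is already known to be sorted and unique, as in the examples above.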