I have a list of numbers that are sometimes consecutive - how to get a list of ranges from it?

Time:04-29

Motivation:

A list such as [1,2,3,4,5,6,7,8,9,10] can be expressed as range(1,11), and in checks for integers against a range run in near constant time. Big-ish lists of numbers can therefore be simplified by expressing them as a list of range() objects, which takes fewer bytes than storing every number in a list when the consecutive runs are long.

# what I have:
numbers = [1, 2, 5, 7, 8, 9, 10, 13, 14, 15, 42, 43, 44, 45, 46, 47, 48, 49, 50] 

# what I want:
ranges = [range(1,3), range(5,6), range(7,11), range(13,16), range(42,51)]

(Self answered) Question:

How can I convert a list of numbers into a list of ranges that cover its runs of consecutive numbers, to get smaller storage and faster in checks?


Inspired by

CodePudding user response:

There should be ways to leverage itertools.takewhile or itertools.groupby (using boolean conditional grouping) - the simplest way without any imports I came up with is:

Generator:

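def get_consecutive_ranges(numbers):
    """Takes sorted integers (a list or any other iterable), yields
    the ranges of consecutive numbers that form the input."""
    start, stop = None, None
    for num in numbers:
        if stop is None:
            # first number opens the first run
            start, stop = num, num
        elif stop == num - 1:
            # num directly follows the current run: extend it
            stop += 1
        else:
            # gap found: emit the finished run and open a new one
            yield range(start, stop + 1)
            start, stop = num, num

    # emit the last run; empty input yields nothing
    if start is not None:
        yield range(start, stop + 1)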

Test:

data = [1, 2, 5, 7, 8, 9, 10, 13, 14, 15, 42, 43, 44, 45, 46, 47, 48, 49,50]

list_of_ranges = list(get_consecutive_ranges(data))
print(list_of_ranges)

to get

[range(1, 3), range(5, 6), range(7, 11), range(13, 16), range(42, 51)]
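A quick way to convince yourself that nothing got lost is to expand the ranges again and compare with the input. This is only a small sanity-check sketch; the ranges are written out literally so it runs on its own, and the name restored is made up for illustration:

```python
from itertools import chain

# the output shown above, written out literally so the snippet is self-contained
list_of_ranges = [range(1, 3), range(5, 6), range(7, 11), range(13, 16), range(42, 51)]
data = [1, 2, 5, 7, 8, 9, 10, 13, 14, 15, 42, 43, 44, 45, 46, 47, 48, 49, 50]

# expanding every range and concatenating the pieces should give back the input
restored = list(chain.from_iterable(list_of_ranges))
print(restored == data)  # True
```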

Application for letter ranges:

letters = "abcghijklmpqrs"
lgroup = [f"{chr(min(r))}-{chr(max(r))}" 
          for r in get_consecutive_ranges(map(ord, letters))]
print(lgroup)

to get

['a-c', 'g-m', 'p-s']   # from  "abcghijklmpqrs"
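For the faster in checks from the question, one possible way to query such a list of ranges is sketched below. The helpers in_any_range and in_sorted_ranges are hypothetical names, not part of the answer above, and the bisect variant assumes the ranges are sorted by start and do not overlap (which is what the generator produces for sorted input without duplicates):

```python
from bisect import bisect_right

ranges = [range(1, 3), range(5, 6), range(7, 11), range(13, 16), range(42, 51)]

def in_any_range(ranges, n):
    """Simple variant: test every range in turn (linear in the number of ranges)."""
    return any(n in r for r in ranges)

def in_sorted_ranges(ranges, starts, n):
    """Binary-search variant; starts must hold r.start for each range, in order."""
    i = bisect_right(starts, n) - 1      # index of the last range starting at or before n
    return i >= 0 and n in ranges[i]

starts = [r.start for r in ranges]       # computed once, reused for every lookup

print(in_any_range(ranges, 9))               # True
print(in_sorted_ranges(ranges, starts, 9))   # True
print(in_sorted_ranges(ranges, starts, 11))  # False
```

The simple variant is usually enough for a handful of ranges; the binary search only starts to pay off when the list of ranges itself gets long.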

CodePudding user response:

Another version, using itertools.groupby:

from itertools import groupby


lst = [1, 2, 5, 7, 8, 9, 10, 13, 14, 15, 42, 43, 44, 45, 46, 47, 48, 49, 50]

out = []
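# index - value is constant within a run of consecutive numbers
for _, g in groupby(enumerate(lst), lambda k: k[0] - k[1]):
    start = next(g)[1]
    # remaining values of the run; a single-element run ends at its start
    end = list(v for _, v in g) or [start]
    out.append(range(start, end[-1] + 1))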

print(out)

Prints:

[range(1, 3), range(5, 6), range(7, 11), range(13, 16), range(42, 51)]
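Why the grouping key works may not be obvious at first: within a run of consecutive numbers, index and value both grow by one per step, so their difference stays the same and only changes at a gap. The sketch below prints the keys for a short list and wraps the same idea in a function. The name ranges_from_sorted is made up for illustration, and the approach assumes the input is sorted ascending and has no duplicates:

```python
from itertools import groupby

lst = [1, 2, 5, 7, 8, 9]

# index - value for every element; the key only changes where a gap occurs
keys = [i - v for i, v in enumerate(lst)]
print(keys)  # [-1, -1, -3, -4, -4, -4]

def ranges_from_sorted(values):
    """Hypothetical wrapper around the groupby approach shown above."""
    out = []
    for _, group in groupby(enumerate(values), key=lambda pair: pair[0] - pair[1]):
        run = [v for _, v in group]              # all values of one consecutive run
        out.append(range(run[0], run[-1] + 1))   # range stop is exclusive, hence + 1
    return out

print(ranges_from_sorted(lst))  # [range(1, 3), range(5, 6), range(7, 10)]
```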