Split list into sublists of length x (with or without overlap)

Time:04-05

There are many similar questions on here, but I can't find exactly what I'm looking for.

I want to split a list into sublists, each of which is exactly length x. This can include overlap, and the area of overlap doesn't matter so much. For example:

list_to_split = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

max_len = 3
desired_result = [[1, 2, 3], [3, 4, 5], [6, 7, 8], [8, 9, 10]]

# or 

max_len = 4
desired_result = [[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]]

# or 

max_len = 5
desired_result = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]

It doesn't matter how many final sublists there are, though I don't want any more than necessary.

It also doesn't matter where the overlap happens; I just need to capture every item in the original list and have each sublist contain the same number of items.

Thanks!

CodePudding user response:

You can adapt the accepted answer by Ned Batchelder in this thread to work for the described scenario.

This is a generator function which I think is a pretty neat solution.

def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    # n must not be 0
    for i in range(0, len(lst), n):
        if i + n >= len(lst):
            yield lst[-n:]
        else:
            yield lst[i:i + n]


l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

for i in range(1, 11):
    print(list(chunks(l, i)))

Expected output:

[[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]]
[[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [8, 9, 10]]
[[1, 2, 3, 4], [5, 6, 7, 8], [7, 8, 9, 10]]
[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
[[1, 2, 3, 4, 5, 6], [5, 6, 7, 8, 9, 10]]
[[1, 2, 3, 4, 5, 6, 7], [4, 5, 6, 7, 8, 9, 10]]
[[1, 2, 3, 4, 5, 6, 7, 8], [3, 4, 5, 6, 7, 8, 9, 10]]
[[1, 2, 3, 4, 5, 6, 7, 8, 9], [2, 3, 4, 5, 6, 7, 8, 9, 10]]
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]
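
One caveat: when n is larger than the list, lst[-n:] quietly returns the whole list, which is shorter than n, and a negative n yields nothing at all. If that matters for your use case, a guarded variant could look like the sketch below. The name chunks_checked and the choice to raise ValueError are my own assumptions, not part of the linked answer.

```python
def chunks_checked(lst, n):
    """Yield n-sized chunks from lst; the last chunk overlaps the previous one if needed."""
    # Assumption: an impossible request is treated as an error rather than
    # silently returning a short chunk. Because this is a generator, the
    # error surfaces when iteration starts, not when the function is called.
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if n > len(lst):
        raise ValueError("n must not exceed len(lst)")
    for i in range(0, len(lst), n):
        # Clip the start so the slice never runs past the end of the list.
        start = min(i, len(lst) - n)
        yield lst[start:start + n]


print(list(chunks_checked([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3)))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9], [8, 9, 10]]
```

Note that an empty list is rejected too, since no chunk of length n can be built from it.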

CodePudding user response:

The trick as I see it is to iterate an index in steps of x but "clip" the last one to be no less than x from the end:

>>> a = list(range(1, 11))
>>> x = 3
>>> [a[i:i + x] for i in (min(i, len(a) - x) for i in range(0, len(a), x))]
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [8, 9, 10]]
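
Clipping puts all of the overlap into the last sublist. If you would rather spread it out, as in the question's examples, one possible approach is to compute the minimum number of sublists and space the start indices evenly between 0 and len(a) - x. This is only a sketch; split_even_overlap and its rounding rule are my own choices.

```python
import math


def split_even_overlap(items, size):
    """Split items into the fewest sublists of length size, spreading the overlap."""
    n = len(items)
    if size <= 0 or size > n:
        raise ValueError("size must be between 1 and len(items)")
    count = math.ceil(n / size)  # fewest sublists that can cover every item
    if count == 1:
        return [items[:size]]  # only possible when size == len(items)
    span = n - size  # start index of the last sublist
    gaps = count - 1
    # Space the starts evenly from 0 to span, rounded to the nearest integer.
    # The step never exceeds size, so neighbouring sublists always touch or overlap.
    starts = [(k * span + gaps // 2) // gaps for k in range(count)]
    return [items[s:s + size] for s in starts]


a = list(range(1, 11))
print(split_even_overlap(a, 3))  # [[1, 2, 3], [3, 4, 5], [6, 7, 8], [8, 9, 10]]
print(split_even_overlap(a, 4))  # [[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]]
print(split_even_overlap(a, 5))  # [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
```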

CodePudding user response:

Taken straight from itertools' recipes:

from itertools import zip_longest
def grouper(iterable, n, *, incomplete='fill', fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, fillvalue='x') --> ABC DEF Gxx
    # grouper('ABCDEFG', 3, incomplete='strict') --> ABC DEF ValueError
    # grouper('ABCDEFG', 3, incomplete='ignore') --> ABC DEF
    args = [iter(iterable)] * n
    if incomplete == 'fill':
        return zip_longest(*args, fillvalue=fillvalue)
    if incomplete == 'strict':
        return zip(*args, strict=True)
    if incomplete == 'ignore':
        return zip(*args)
    else:
        raise ValueError('Expected fill, strict, or ignore')

You can then use the "fill" option and pad the final group with the last item if you want to guarantee the size of every sublist:

>>> list_to_split = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> max_len = 3
>>> list(grouper(list_to_split, max_len, fillvalue=list_to_split[-1]))
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 10, 10)]
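
Padding repeats the last value rather than overlapping with earlier items. If you want overlap but still need to accept an arbitrary iterable (not just a list), one option is to remember the previous chunk and borrow its tail when the final chunk comes up short. This is a sketch of that idea; fixed_chunks is a hypothetical helper, not one of the itertools recipes.

```python
from itertools import islice


def fixed_chunks(iterable, n):
    """Yield lists of length n; a short final chunk is topped up from the previous one."""
    it = iter(iterable)
    prev = []
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return
        if len(chunk) < n:
            # Borrow the last n - len(chunk) items of the previous chunk.
            # If there is no previous chunk (input shorter than n),
            # the chunk is yielded short.
            chunk = prev[len(chunk):] + chunk
        yield chunk
        prev = chunk


print(list(fixed_chunks(iter(range(1, 11)), 3)))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9], [8, 9, 10]]
```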

CodePudding user response:

Iterate over the list, slicing max_len elements at a time. We start each slice at min(idx, len(list_to_split) - max_len) so that a slice that would run past the end of the list is shifted back to cover the last max_len elements instead:

list_to_split = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
max_len = 3

result = []
for idx in range(0, len(list_to_split), max_len):
    start = min(idx, len(list_to_split) - max_len)
    result.append(list_to_split[start:start + max_len])

print(result)

You can turn this into a list comprehension, but it's admittedly not very readable:

list_to_split = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
max_len = 3

result = [
    list_to_split[
    min(idx, len(list_to_split) - max_len):
    min(idx, len(list_to_split) - max_len) + max_len]
    for idx in range(0, len(list_to_split), max_len)
]

print(result)

Both of these output:

[[1, 2, 3], [4, 5, 6], [7, 8, 9], [8, 9, 10]]
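
If you want to convince yourself that this meets the requirements from the question (equal lengths, every item captured, no more sublists than necessary), a quick check along these lines might help. The function names are made up for the example, and the coverage test assumes the items are hashable.

```python
import math


def split_fixed(items, size):
    """The loop from above, wrapped in a function."""
    result = []
    for idx in range(0, len(items), size):
        start = min(idx, len(items) - size)
        result.append(items[start:start + size])
    return result


def satisfies_requirements(items, size):
    parts = split_fixed(items, size)
    same_length = all(len(part) == size for part in parts)
    covers_all = {x for part in parts for x in part} == set(items)
    minimal = len(parts) == math.ceil(len(items) / size)
    return same_length and covers_all and minimal


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(all(satisfies_requirements(data, size) for size in range(1, 11)))  # True
```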

CodePudding user response:

The following code works without any libraries, although there are also many libraries you could use. Note that the last sublist is shorter when len(l) is not a multiple of x:

def main():
    '''The Main'''

    l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    x = 3


    print([l[i:i + x] for i in range(0, len(l), x)])


if __name__ == '__main__':
    main()

Output:

[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]