Add list elements to end of list of lists

I have a list of values, and I want to append each one to the end of the corresponding list in a list of lists. Is there a Pythonic or efficient way to do this?

For example, given:

x = [['a','b','c'],['d','e','f'],['g','h','i']]
y = [1,2,3]

I would expect: [['a', 'b', 'c', 1], ['d', 'e', 'f', 2], ['g', 'h', 'i', 3]]

I've tried:

list(zip(x,y))

But, this produces:

[(['a', 'b', 'c'], 1), (['d', 'e', 'f'], 2), (['g', 'h', 'i'], 3)]

I can solve it with an inefficient loop like this:

new_data = []
for i, row in enumerate(x):
    row.append(y[i])
    new_data.append(row)

print(new_data)
[['a', 'b', 'c', 1], ['d', 'e', 'f', 2], ['g', 'h', 'i', 3]]

CodePudding user response：

To build a new list, I might do:

>>> x = [['a','b','c'],['d','e','f'],['g','h','i']]
>>> y = [1,2,3]
>>> [a + [b] for a, b in zip(x, y)]
[['a', 'b', 'c', 1], ['d', 'e', 'f', 2], ['g', 'h', 'i', 3]]

If you want to modify x in place, you don't need to use enumerate, just loop over the zip and append the y-elements to the x-elements:

>>> for a, b in zip(x, y):
...     a.append(b)
...
>>> x
[['a', 'b', 'c', 1], ['d', 'e', 'f', 2], ['g', 'h', 'i', 3]]
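The practical difference between the two approaches is whether the original x is touched. A minimal sketch of that, using illustrative names (combined, untouched, mutated) that are not part of the question:

```python
x = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
y = [1, 2, 3]

# Building a new list: a + [b] creates a fresh sublist each time,
# so the sublists of x are left as they were.
combined = [a + [b] for a, b in zip(x, y)]
untouched = [len(a) for a in x]   # still 3 elements each

# In place: append mutates the very same sublist objects held by x.
for a, b in zip(x, y):
    a.append(b)
mutated = [len(a) for a in x]     # now 4 elements each

print(combined == x)  # both approaches end up with the same contents
```

If other code still holds references to the sublists of x, the in-place version changes what that code sees, while the comprehension does not.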

CodePudding user response：

You can unpack the first list when constructing the sublists using zip():

[[*item1, item2] for item1, item2 in zip(x, y)]

For example:

x = [['a','b','c'],['d','e','f'],['g','h','i']]
y = [1,2,3]

print([[*item1, item2] for item1, item2 in zip(x, y)])

outputs:

[['a', 'b', 'c', 1], ['d', 'e', 'f', 2], ['g', 'h', 'i', 3]]
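One difference from the concatenation form may be worth knowing. The sketch below assumes the rows are tuples, which is not the case in the question: unpacking accepts any iterable, while + needs both operands to be lists.

```python
rows = [('a', 'b', 'c'), ('d', 'e', 'f')]  # tuples instead of lists (assumption)
y = [1, 2]

# Unpacking works for any iterable and always produces lists.
unpacked = [[*row, n] for row, n in zip(rows, y)]
print(unpacked)

# Concatenation with + fails here: tuple + list raises TypeError.
try:
    concatenated = [row + [n] for row, n in zip(rows, y)]
except TypeError:
    concatenated = None
print(concatenated)
```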

CodePudding user response：

Most of the solutions posted here perform about the same as each other, and about the same as your own loop.

The solution by @kellybundy is the only one that stands out and I doubt you'll find a faster one, given how minimal it is and the fact that it already relies on Python's fast internals. (please accept their answer, not this one, if you agree)

Consider:

from copy import deepcopy
from timeit import timeit
from random import choice
from collections import deque

chars = 'abcdefghijkl'

texts = [[choice(chars) for _ in range(3)] for _ in range(1000)]
nums = [n for n in range(1000)]


def combine0(xss, ys):
    return xss  # only here to show the cost of any overhead


def combine1(xss, ys):
    result = []
    for i, xs in enumerate(xss):
        xs.append(ys[i])
        result.append(xs)
    return result


def combine2(xss, ys):
    return [xs + [y] for xs, y in zip(xss, ys)]


def combine3(xss, ys):
    return [[*xs, y] for xs, y in zip(xss, ys)]


def combine4(xss, ys):
    result = []
    for xs, y in zip(xss, ys):
        xs.append(y)
        result.append(xs)
    return result


def combine5(xss, ys):
    deque(map(list.append, xss, ys), 0)
    return xss


assert combine1(deepcopy(texts), nums) == combine2(deepcopy(texts), nums) == combine3(deepcopy(texts), nums) == combine4(deepcopy(texts), nums) == combine5(deepcopy(texts), nums)

for _ in range(10):
    for n, f in enumerate((combine0, combine1, combine2, combine3, combine4, combine5)):
        copies = iter([deepcopy(texts) for _ in range(1000)])
        print(f.__name__, timeit(lambda: f(next(copies), nums), number=1000))

Result:

combine0 0.000204699999812874
combine1 0.08227699999952165
combine2 0.09336890000031417
combine3 0.07344190000003437
combine4 0.0657749000001786
combine5 0.01626650000071095
combine0 0.00023539999983768212
combine1 0.07561700000042038
combine2 0.09280970000054367
combine3 0.09156400000028952
combine4 0.06638789999942674
combine5 0.0177254999998695
combine0 0.00021800000013172394
combine1 0.08468040000025212
combine2 0.09661589999996067
combine3 0.08732049999980518
combine4 0.07385549999980867
combine5 0.015442100000655046
etc.

This shows that runtimes vary quite a bit from run to run, depending on all sorts of other factors, but combine5, which uses @kellybundy's solution, has a clear advantage: it is roughly four to six times faster than the others in these runs.

The combine0 lines show the cost of a function that does nothing, which confirms that we're measuring the work done by the functions and not just the overhead of the calls.

Note: the deep copies are there so that no list is modified more than once, and they are created before the timing starts so that copying doesn't affect the measurement.
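To see why fresh copies matter, here is a small sketch of what happens when an in-place variant is run repeatedly on the same input. The helper name append_in_place and the tiny data set are made up for illustration:

```python
from copy import deepcopy

def append_in_place(xss, ys):
    # Same idea as combine4: mutates the sublists of xss.
    for xs, y in zip(xss, ys):
        xs.append(y)
    return xss

texts = [['a', 'b', 'c'], ['d', 'e', 'f']]
nums = [1, 2]

# Reusing one list: every call makes the sublists longer.
shared = deepcopy(texts)
append_in_place(shared, nums)
append_in_place(shared, nums)
print(shared)

# One fresh copy per call: each call sees the original 3-element sublists.
fresh_results = [append_in_place(deepcopy(texts), nums) for _ in range(2)]
print(fresh_results[0] == fresh_results[1])
```

Without the copies, later timing iterations would operate on ever-growing sublists, so the functions would not be compared on the same input.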

CodePudding user response：

A potentially more efficient solution uses collections.deque and map to run append over the list/value pairs quickly:

deque(map(list.append, x, y), 0)

Benchmark (using 1000 times longer outer lists):

189 us  191 us  192 us  with_loop
 77 us   77 us   77 us  with_deque

The 0, by the way, is deque's maxlen: it tells deque to consume the iterator without storing anything, so the memory overhead is small and constant. And it's very fast. That's why it's used in itertools' consume recipe and in more-itertools' consume function.
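For reference, a minimal sketch of that idea as a reusable helper. It is a simplified take on the itertools consume recipe (the documented recipe also accepts an optional count, omitted here), and it makes visible that map does nothing until something iterates over it:

```python
from collections import deque

def consume(iterator):
    """Exhaust an iterator, discarding every item."""
    # maxlen=0 means the deque never stores anything it is fed.
    deque(iterator, maxlen=0)

x = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
y = [1, 2, 3]

# map is lazy: nothing is appended until something iterates over it.
lazy = map(list.append, x, y)
before = [len(a) for a in x]

consume(lazy)  # drives the map, calling list.append(a, b) for each pair
after = [len(a) for a in x]

print(x)
```

The values produced by map are all None (the return value of list.append), which is why throwing them away is the right thing to do here.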

Benchmark code:

def with_loop(x, y):
    for a, b in zip(x, y):
        a.append(b)

def with_deque(x, y):
    deque(map(list.append, x, y), 0)

from timeit import repeat
from collections import deque

funcs = with_loop, with_deque
tss = [[] for _ in funcs]
for _ in range(20):
    for func, ts in zip(funcs, tss):
        x = [['a','b','c'],['d','e','f'],['g','h','i']]
        y = [1,2,3]
        scale = 1000
        x = [a[:] for _ in range(scale) for a in x]
        y *= scale
        t = min(repeat(lambda: func(x, y), number=1))
        ts.append(t)
for func, ts in zip(funcs, tss):
    print(*('%3d us ' % (t * 1e6) for t in sorted(ts)[:3]), func.__name__)