Appending element-wise a `Series` to an existing `Series` of lists

Say that I have a series of list objects, and another series of values, e.g.,

a = pd.Series([
    [1],
    [1, 2, 3],
    [4, 5],
    [],
    [2],
    [3, 6],
])

b = pd.Series([10, 20, 30, 40, 50, 60])

I wish to append b element-wise to a, i.e., to get

result = pd.Series([
    [1, 10],
    [1, 2, 3, 20],
    [4, 5, 30],
    [40],
    [2, 50],
    [3, 6, 60],
])

What is the best way to go about this?

Note, if it would make things easier to have a pd.Series of pd.array objects (or something else) instead of list objects, that's fine.

As a side point, is there any easy way to remove NaN values from the individual lists in a series like a or result?

CodePudding user response：

Use apply to wrap each element of b in a single-element list, then concatenate the two series with +:

import pandas as pd

a = pd.Series([
    [1],
    [1, 2, 3],
    [4, 5],
    [],
    [2],
    [3, 6],
])

b = pd.Series([10, 20, 30, 40, 50, 60])


res = a + b.apply(lambda x: [x])
print(res)

Output

0          [1, 10]
1    [1, 2, 3, 20]
2       [4, 5, 30]
3             [40]
4          [2, 50]
5       [3, 6, 60]
dtype: object

Alternative:

res = a + pd.Series(b.to_frame().to_numpy().tolist())

CodePudding user response：

You can just use zip, which will in general be faster than apply:

pd.Series([lst + [e] for lst, e in zip(a, b)])

0          [1, 10]
1    [1, 2, 3, 20]
2       [4, 5, 30]
3             [40]
4          [2, 50]
5       [3, 6, 60]
dtype: object
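
One difference between the two approaches that may matter: the + form aligns the two series on their index labels, while zip pairs elements purely by position. With the default index used in the question they give the same result. The sketch below uses a made-up pair of series whose labels are in a different order (an assumption for illustration, not part of the question) to show where they diverge:

```python
import pandas as pd

a = pd.Series([[1], [2], [3]])                  # default index 0, 1, 2
b = pd.Series([10, 20, 30], index=[2, 1, 0])    # same labels, reversed order

# Operator form: pandas aligns on index labels before concatenating the lists.
by_label = a + b.apply(lambda x: [x])

# zip form: pairs elements by position and ignores the labels of b.
by_position = pd.Series([lst + [e] for lst, e in zip(a, b)], index=a.index)

print(by_label.loc[0])       # uses the value of b labelled 0
print(by_position.loc[0])    # uses the first value of b
```

If the two series are not guaranteed to share an index, it is worth deciding which pairing you actually want before choosing between the two.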

To modify a in place:

for lst, e in zip(a, b): lst.append(e)

a
0          [1, 10]
1    [1, 2, 3, 20]
2       [4, 5, 30]
3             [40]
4          [2, 50]
5       [3, 6, 60]
dtype: object
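
A caveat with the in-place version: the lists are ordinary Python objects, and Series.copy() copies the series but not the objects stored in it, so any copy taken earlier will see the appended values too. A minimal sketch with made-up data (the names snapshot and independent are only for illustration):

```python
import pandas as pd

a = pd.Series([[1], [2]])
b = pd.Series([10, 20])

snapshot = a.copy()    # copies the Series, but not the list objects inside it

for lst, e in zip(a, b):
    lst.append(e)      # mutates list objects that snapshot also refers to

# True: both series hold the very same list object
shares_lists = snapshot.iloc[0] is a.iloc[0]
print(shares_lists)

# For an independent copy, rebuild each list explicitly.
independent = pd.Series([list(lst) for lst in a], index=a.index)
independent.iloc[0].append(99)    # does not affect a
```

If that sharing is a problem, the non-mutating lst + [e] version above avoids it, since it builds new lists.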

Benchmark against apply:

from timeit import timeit

timeit("a + b.apply(lambda x: [x])", number=100, globals=globals())
0.11637999999948079

timeit("pd.Series([lst + [e] for lst, e in zip(a, b)])", number=100, globals=globals())
0.04336999999941327
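
The timings above are for a six-element series, where fixed overhead dominates. To check how the two compare on your own data, something like the following sketch could be used. The size n and the synthetic contents are arbitrary assumptions, and no timings are quoted here because they depend on the machine and the pandas version:

```python
from timeit import timeit

import pandas as pd

# Synthetic data; the size and contents are arbitrary choices for illustration.
n = 10_000
a_big = pd.Series([[i] * (i % 4) for i in range(n)])
b_big = pd.Series(range(n))

def with_apply():
    return a_big + b_big.apply(lambda x: [x])

def with_zip():
    return pd.Series([lst + [e] for lst, e in zip(a_big, b_big)])

# Check that both approaches agree before comparing their timings.
same = with_apply().tolist() == with_zip().tolist()

t_apply = timeit(with_apply, number=10)
t_zip = timeit(with_zip, number=10)
print(f"same result: {same}  apply: {t_apply:.4f}s  zip: {t_zip:.4f}s")
```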

To remove null values:

import numpy as np

a = pd.Series([
    [1],
    [1, 2, 3, np.nan],
    [4, 5],
    [],
    [2],
    [3, np.nan, 6],
])

a
0               [1]
1    [1, 2, 3, nan]
2            [4, 5]
3                []
4               [2]
5       [3, nan, 6]
dtype: object

a.apply(lambda lst: [x for x in lst if pd.notnull(x)])

0          [1]
1    [1, 2, 3]
2       [4, 5]
3           []
4          [2]
5       [3, 6]
dtype: object
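
If both steps are needed, the append and the null filtering can be done in one pass. The helper below is a hypothetical name and only a sketch. It relies on pd.notnull treating None as missing as well as NaN:

```python
import numpy as np
import pandas as pd

a = pd.Series([[1], [1, np.nan, None], []])
b = pd.Series([10, np.nan, 30])    # float dtype because of the NaN

def append_and_drop_nulls(lists, values):
    """Hypothetical helper: append values[i] to lists[i], then drop nulls."""
    return pd.Series(
        [[x for x in lst + [e] if pd.notnull(x)] for lst, e in zip(lists, values)],
        index=lists.index,
    )

result = append_and_drop_nulls(a, b)
print(result)
```

Note that this pairs elements by position and builds new lists, so the original a is left unchanged.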