Say that I have a series of list objects, and another series of values, e.g.,
a = pd.Series([
[1],
[1, 2, 3],
[4, 5],
[],
[2],
[3, 6],
])
b = pd.Series([10, 20, 30, 40, 50, 60])
I wish to append b element-wise to a, i.e., to get
result = pd.Series([
[1, 10],
[1, 2, 3, 20],
[4, 5, 30],
[40],
[2, 50],
[3, 6, 60],
])
What is the best way to go about this?
Note, if it would make things easier to have a pd.Series of pd.array objects (or something else) instead of list objects, that's fine.
As a side point, is there any easy way to remove NaN values from the individual lists in a series like a or result?
CodePudding user response:
Use apply to wrap each element of b in a single-element list, then concatenate the two series with +:
import pandas as pd
a = pd.Series([
[1],
[1, 2, 3],
[4, 5],
[],
[2],
[3, 6],
])
b = pd.Series([10, 20, 30, 40, 50, 60])
res = a + b.apply(lambda x: [x])
print(res)
Output
0 [1, 10]
1 [1, 2, 3, 20]
2 [4, 5, 30]
3 [40]
4 [2, 50]
5 [3, 6, 60]
dtype: object
Alternative:
res = a + pd.Series(b.to_frame().to_numpy().tolist())
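One thing to watch with Series addition: pandas pairs rows by index label, not by position. The sketch below uses a deliberately mismatched index on b (a hypothetical setup, not part of the question) to show the difference, and resets the index to get positional pairing back.

```python
import pandas as pd

a = pd.Series([[1], [2]])                # labels 0 and 1
b = pd.Series([10, 20], index=[1, 2])    # labels 1 and 2 (hypothetical mismatch)

wrapped = b.apply(lambda x: [x])

# Label-based pairing: only label 1 exists on both sides,
# so only that row can be concatenated.
by_label = a + wrapped
print(by_label)

# Position-based pairing: discard b's labels before adding.
by_position = a + wrapped.reset_index(drop=True)
print(by_position.tolist())  # [[1, 10], [2, 20]]
```

When both series share the same default RangeIndex, as in the question, the two pairings coincide.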
CodePudding user response:
You can just use zip, which will in general be faster than apply:
pd.Series([lst + [e] for lst, e in zip(a, b)])
0 [1, 10]
1 [1, 2, 3, 20]
2 [4, 5, 30]
3 [40]
4 [2, 50]
5 [3, 6, 60]
dtype: object
To modify a in place:
for lst, e in zip(a, b):
    lst.append(e)
a
0 [1, 10]
1 [1, 2, 3, 20]
2 [4, 5, 30]
3 [40]
4 [2, 50]
5 [3, 6, 60]
dtype: object
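A caveat on the in-place version: append mutates the list objects themselves, so anything else holding references to those same lists changes too. As far as I know, Series.copy() copies the container but not the Python objects inside it. A minimal sketch, with alias and detached as made-up names for illustration:

```python
import pandas as pd

a = pd.Series([[1], [], [2]])
b = pd.Series([10, 20, 30])

# A new Series that still points at the same list objects as a.
alias = a.copy()

# A new Series holding fresh copies of each list.
detached = pd.Series([list(lst) for lst in a], index=a.index)

for lst, e in zip(a, b):
    lst.append(e)

print(alias.tolist())     # [[1, 10], [20], [2, 30]]
print(detached.tolist())  # [[1], [], [2]]
```

If the original lists must stay untouched, the non-mutating list comprehension is the safer choice.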
Benchmark against apply:
from timeit import timeit
timeit("a + b.apply(lambda x: [x])", number=100, globals=globals())
0.11637999999948079
timeit("pd.Series([lst + [e] for lst, e in zip(a, b)])", number=100, globals=globals())
0.04336999999941327
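The timings above are for six rows, where fixed overhead dominates. To check how the two approaches compare on your own data, something like the following could be used; the size n and the function names are arbitrary choices for this sketch, and the numbers will vary by machine and pandas version.

```python
from timeit import timeit

import pandas as pd

n = 10_000  # hypothetical size, adjust to match your data
a = pd.Series([[i] for i in range(n)])
b = pd.Series(range(n))


def with_apply():
    return a + b.apply(lambda x: [x])


def with_zip():
    return pd.Series([lst + [e] for lst, e in zip(a, b)], index=a.index)


# Sanity check: both approaches should build the same lists.
same_result = with_apply().tolist() == with_zip().tolist()

t_apply = timeit(with_apply, number=20)
t_zip = timeit(with_zip, number=20)
print(f"apply: {t_apply:.4f}s  zip: {t_zip:.4f}s  same result: {same_result}")
```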
To remove null values:
import numpy as np
a = pd.Series([
[1],
[1, 2, 3, np.nan],
[4, 5],
[],
[2],
[3, np.nan, 6],
])
a
0 [1]
1 [1, 2, 3, nan]
2 [4, 5]
3 []
4 [2]
5 [3, nan, 6]
dtype: object
a.apply(lambda lst: [x for x in lst if pd.notnull(x)])
0 [1]
1 [1, 2, 3]
2 [4, 5]
3 []
4 [2]
5 [3, 6]
dtype: object
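If you need both steps, the null filter and the append can be done in one pass over zip. A minimal sketch with made-up sample data:

```python
import numpy as np
import pandas as pd

a = pd.Series([[1, np.nan], [], [np.nan, 3]])
b = pd.Series([10, 20, 30])

# Drop nulls from each list, then append the matching element of b.
result = pd.Series(
    [[x for x in lst if pd.notnull(x)] + [e] for lst, e in zip(a, b)],
    index=a.index,  # keep a's labels on the result
)
print(result.tolist())  # [[1, 10], [20], [3, 30]]
```

Passing index=a.index keeps the original labels, which only matters when a does not use the default 0..n-1 index.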