Flattening lists inside a Pandas series

Time:03-23

I have a Pandas series of lists:

[[ambulance],[],[]]
[[truck],[bus],[],[company],[ambulance]]
[[bus],[],[],[]]

And I'm trying to clean this to:

[ambulance]
[truck,bus,company,ambulance]
[bus]

I tried explode(), but the empty [] entries are still there, and where a row has two or more items its index is duplicated, like:

1 [truck]
1 [bus]
1 []
1 [company]
1 [ambulance]

How can I fix this?

CodePudding user response:

After you explode, use .str[0] to get the first value of each sub-list (or NaN if the sub-list is empty), then dropna to discard the NaN values, and rebuild the lists with groupby(level=0).agg(list):

df['l'] = df['l'].explode().str[0].dropna().groupby(level=0).agg(list)

Output:

>>> df
                                  l
0                       [ambulance]
1  [truck, bus, company, ambulance]
2                             [bus]
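
The same chain can be run step by step to see what each call contributes. This is only a sketch: the column name l comes from the answer, the sample frame is rebuilt from the question's data, and the intermediate variable names are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed shape: one list of lists per row).
df = pd.DataFrame({
    "l": [
        [["ambulance"], [], []],
        [["truck"], ["bus"], [], ["company"], ["ambulance"]],
        [["bus"], [], [], []],
    ]
})

exploded = df["l"].explode()   # one sub-list per row; the original index is repeated
first = exploded.str[0]        # first element of each sub-list, NaN for an empty one
kept = first.dropna()          # drop the rows that came from empty sub-lists
flat = kept.groupby(level=0).agg(list)  # regroup by original index into lists

df["l"] = flat
print(df)
```

The duplicated index that explode produces is what groupby(level=0) relies on to put the values back into their original rows.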

CodePudding user response:

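You can just map sum over the series, passing an empty list as the start value so that the sub-lists in each row are concatenated: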

df['l'] = df['l'].map(lambda x: sum(x, []))
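
This works because sum(x, []) starts from the empty list and adds each sub-list with +, so empty sub-lists contribute nothing. A small sketch on plain Python lists, using the rows from the question; itertools.chain.from_iterable is not part of the answer and is shown here only as an alternative that avoids building a new list at every addition:

```python
from itertools import chain

# Rows copied from the question, as plain Python lists.
rows = [
    [["ambulance"], [], []],
    [["truck"], ["bus"], [], ["company"], ["ambulance"]],
    [["bus"], [], [], []],
]

# sum with a list start value: [] + ["ambulance"] + [] + [] and so on.
flat_sum = [sum(row, []) for row in rows]

# Alternative: chain the sub-lists together and materialise the result.
flat_chain = [list(chain.from_iterable(row)) for row in rows]

print(flat_sum)
```

Either expression can be used as the function passed to map; for rows with many sub-lists the chain version is likely to be faster, since sum copies the accumulated list on each step.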