Home > database >  Aggregating rows of lists in pandas
Aggregating rows of lists in pandas

Time:10-21

I have a Series that looks like this

index     column

A         [41, 13, 4, 50]

A         [41, 13, 4, 5]
.
.
.

What I want to do is aggregate the rows of lists, but only the unique values.

The result should look like this:

index     column
    
    A     [41, 13, 4, 50, 5]

.
.
.

Is there any simple way to do this?

CodePudding user response:

(df.explode('column') # explode list to bring values in rows
 .drop_duplicates()  # drop duplicates
 .groupby('index',as_index=False) # group on index
 .agg(list)) # aggregate as list
    index   column
0   A   [41, 13, 4, 50, 5]

if you have a series then

s.explode().drop_duplicates().groupby(level=0).agg(list)
A    [41, 13, 4, 50, 5]
B             [3, 2, 1]
dtype: object

Data used

s=pd.Series([[41, 13, 4, 50], [41, 13, 4, 5], [5,4,3], [5,2,1]], index=['A', 'A', 'B', 'B'])
s
A    [41, 13, 4, 50]
A     [41, 13, 4, 5]
B          [5, 4, 3]
B          [5, 2, 1]
dtype: object

CodePudding user response:

you can also use this:

df.groupby(df.index).column.apply(lambda x: x.explode().unique())

here you first group based on index, then expand everything and return the unique numbers as list

example:

df = pd.DataFrame({'data': [[41, 13, 4, 50], [41, 13, 4, 5], [5,4,3], [5,2,1]]}, index=['A', 'A', 'B', 'B'])

df.groupby(df.index).data.apply(lambda x: x.explode().unique())

enter image description here

  • Related