How to group dataframe rows on unique elements in a specific column?

Time:05-31

As an example, how do I convert df to df1 by gathering the col3 rows that share the same value in the column tidx into one matrix per tidx value?

>>> import numpy as np
>>> import pandas as pd

>>> df = pd.DataFrame({'col3':[[1,40],[2,50],[3,60],[4,70]], 'tidx':[21,22,23,21]})

>>> df['col3'] = df['col3'].apply(np.array)

>>> df
      col3  tidx
0  [1, 40]    21
1  [2, 50]    22
2  [3, 60]    23
3  [4, 70]    21

>>> df1 = pd.DataFrame({'col3':[[[1,40],[4,70]],[[2,50]],[[3,60]]], 'tidx':[21,22,23]})

>>> df1['col3'] = df1['col3'].apply(np.array)


>>> df1
                 col3  tidx
0  [[1, 40], [4, 70]]    21
1           [[2, 50]]    22
2           [[3, 60]]    23


CodePudding user response:

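You can group on tidx with .groupby and then apply the list function to each group's col3 values, as shown in the example below. Each cell of col3 in the result holds a list of the original rows, so pass it through np.array if you need an actual matrix.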

import pandas as pd
df = pd.DataFrame({'col3':[[1,40],[2,50],[3,60],[4,70]], 'tidx':[21,22,23,21]})
# Group rows by tidx and collect each group's col3 values into a list
df1 = df.groupby('tidx')['col3'].apply(list).reset_index()
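If you want each group to be a 2-D NumPy array rather than a list of rows, one possible follow-up is to convert the grouped lists with np.array, the same way the question builds df1. This is only a sketch on the sample data from the question, and it assumes every row in col3 has the same length so that the rows stack cleanly.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col3': [[1, 40], [2, 50], [3, 60], [4, 70]],
                   'tidx': [21, 22, 23, 21]})

# Collect the col3 rows of each tidx group into a list (groups are sorted by tidx)
df1 = df.groupby('tidx')['col3'].apply(list).reset_index()

# Turn each list of rows into a 2-D array, one matrix per tidx value
df1['col3'] = df1['col3'].apply(np.array)

# Put the columns in the same order as the df1 shown in the question
df1 = df1[['col3', 'tidx']]

for tidx, mat in zip(df1['tidx'], df1['col3']):
    print(tidx, mat.shape)
```

The group for tidx 21 contains two rows, so its matrix has two rows; the other groups each hold a single-row matrix.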