Merge list in multiple columns to a single column in pandas

Time:10-16

I have a pandas dataframe in the below format:

           0           1        2           3
A.pkl     [121,122]   [123]    [124,125]    [126,127]

There may be more columns than shown here. In the end, I would like to merge the values from all the columns and write them to a single column.

Result dataframe:

           values          
A.pkl     [121,122,123,124,125,126,127]   

I use the below code to generate the first part:

import pandas as pd

# groups is assumed to be an iterable of pickle file paths, defined elsewhere
df = pd.DataFrame({
    g: pd.read_pickle(f'{g}')['values'].tolist()
    for g in groups
}).T

I tried using itertools.chain, but it doesn't seem to do the trick.

Any suggestions would be appreciated.

Input dataframe:

df = pd.DataFrame({'name': ['aa.pkl'],
                   '0': [["001A000001", "003A0025"]],
                   '1': [["003B000001", "003C000001"]],
                   '2': [["003D000001", "003E000001"]],
                   '3': [["003F000001", "003G000001"]]})

The above dataframe is generated by reading the pickle file.

CodePudding user response:

Actually itertools.chain is one way to go, but you have to do it properly:

from itertools import chain
df.apply(lambda x: list(chain(*x)), axis=1)

output:

A.pkl    [121, 122, 123, 124, 125, 126, 127]
dtype: object
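For the second input frame in the question, where the file name sits in a regular 'name' column, the same idea might look like the sketch below. It assumes 'name' should become the index (otherwise the string 'aa.pkl' would itself be chained, character by character) and that the result column should be called 'values', as in the desired output. The variable name merged is only illustrative.

```python
from itertools import chain

import pandas as pd

# Second input frame from the question: the file name is a regular column.
df = pd.DataFrame({'name': ['aa.pkl'],
                   '0': [["001A000001", "003A0025"]],
                   '1': [["003B000001", "003C000001"]],
                   '2': [["003D000001", "003E000001"]],
                   '3': [["003F000001", "003G000001"]]})

# Move 'name' into the index so that only the list-valued columns are chained,
# flatten each row into one list, then name the resulting column 'values'.
merged = (df.set_index('name')
            .apply(lambda row: list(chain.from_iterable(row)), axis=1)
            .to_frame('values'))

print(merged)
```

chain.from_iterable(row) is equivalent to chain(*row) here; it simply avoids unpacking the row into arguments.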

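As @QuangHoang suggested, you can also use the df.sum(axis=1) trick, but be careful: it only concatenates when the cells are Python lists, because + joins lists. If for some reason the cells are numpy arrays, + is element-wise, so the arrays are added position by position with broadcasting, and the input below gives [494, 497] instead.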

Input:

df = pd.DataFrame({'0': [[121, 122]],
                   '1': [[123]],
                   '2': [[124, 125]],
                   '3': [[126, 127]]})
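
The difference between list cells and numpy array cells under df.sum(axis=1) can be sketched as follows. This is a minimal illustration using the input above with an 'A.pkl' index added; the names as_lists, df_arrays and as_arrays are hypothetical, and converting the cells with Series.map(np.array) is just one assumed way of ending up with arrays in the frame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'0': [[121, 122]],
                   '1': [[123]],
                   '2': [[124, 125]],
                   '3': [[126, 127]]}, index=['A.pkl'])

# Cells are Python lists: + concatenates, so the row sum is the merged list.
as_lists = df.sum(axis=1)
print(as_lists['A.pkl'])

# Same data with numpy arrays as cells: + is element-wise (with broadcasting),
# so the row sum adds the arrays position by position instead of joining them.
df_arrays = df.apply(lambda col: col.map(np.array))
as_arrays = df_arrays.sum(axis=1)
print(as_arrays['A.pkl'])
```

If the cell type is not guaranteed, the itertools.chain version is the safer choice, since it iterates over each cell rather than relying on +.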