Pandas: how to explode a row of list elements into their own separate columns

Time:10-29

I have a Pandas dataframe that looks like this:

import pandas as pd

data = {
    'a': [['Foo', 49.51, -120.69], ['Foo', 49.51, -120.69], ['Foo', 49.51, -120.69], ['Foo', 49.51, -120.69]],
    'b': [['YLK', 44.48, -79.55], ['HG76', 44.60, -65.76], ['DEF', 49.52, -113.99], ['YXZ', 47.96, -84.78]],
    'c': [1628.931942, 1949.748061, 2556.622213, 301.193418],
}

df = pd.DataFrame(data)
df

    a                       b                       c
0   [Foo, 49.51, -120.69]   [YLK, 44.48, -79.55]    1628.931942
1   [Foo, 49.51, -120.69]   [HG76, 44.6, -65.76]    1949.748061
2   [Foo, 49.51, -120.69]   [DEF, 49.52, -113.99]   2556.622213
3   [Foo, 49.51, -120.69]   [YXZ, 47.96, -84.78]    301.193418

I would like to split out columns a and b such that their elements become their own columns, like this:

    a     b       c         d       e        f         g
0   Foo   49.51   -120.69   YLK     44.48    -79.55    1628.931942
1   Foo   49.51   -120.69   HG76    44.6     -65.76    1949.748061
2   Foo   49.51   -120.69   DEF     49.52    -113.99   2556.622213
3   Foo   49.51   -120.69   YXZ     47.96    -84.78    301.193418

How would I do this?

Thanks!

CodePudding user response:

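You can build a new DataFrame from the lists and join it back onto the original (shown here for column a; column b works the same way):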

df = df.join(pd.DataFrame(df.pop('a').tolist(), index=df.index))
Out[215]: 
                       b            c    0      1       2
0   [YLK, 44.48, -79.55]  1628.931942  Foo  49.51 -120.69
1   [HG76, 44.6, -65.76]  1949.748061  Foo  49.51 -120.69
2  [DEF, 49.52, -113.99]  2556.622213  Foo  49.51 -120.69
3   [YXZ, 47.96, -84.78]   301.193418  Foo  49.51 -120.69
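
As a rough sketch of how the same idea extends to both list columns, the snippet below expands a and b in turn and joins the pieces together. The column names a1..a3 and b1..b3 and the shortened two-row frame are only illustrative choices, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [['Foo', 49.51, -120.69], ['Foo', 49.51, -120.69]],
    'b': [['YLK', 44.48, -79.55], ['HG76', 44.60, -65.76]],
    'c': [1628.931942, 1949.748061],
})

# pop() removes the list column from df and returns it; tolist() yields a
# list of lists that the DataFrame constructor spreads across columns.
# Passing index=df.index keeps the rows aligned for the join.
a_parts = pd.DataFrame(df.pop('a').tolist(), index=df.index,
                       columns=['a1', 'a2', 'a3'])  # illustrative names
b_parts = pd.DataFrame(df.pop('b').tolist(), index=df.index,
                       columns=['b1', 'b2', 'b3'])  # illustrative names

# After both pops, df only holds column c.
result = a_parts.join(b_parts).join(df)
print(result)
```

Because the values never pass through strings, the numeric columns should keep a float dtype.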

CodePudding user response:

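Here is one way to do it. Note that this goes through strings, so the numeric values end up as strings and need converting back if you want floats: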

df[['a1','a2','a3']]=df['a'].apply(lambda x: ','.join(map(str, x)) ).str.split(',', expand=True)
df[['b1','b2','b3']]=df['b'].apply(lambda x: ','.join(map(str, x)) ).str.split(',', expand=True)
df.drop(columns=['a','b'], inplace=True)
df
              c     a1      a2       a3         b1      b2      b3
0   1628.931942     Foo     49.51   -120.69     YLK     44.48   -79.55
1   1949.748061     Foo     49.51   -120.69     HG76    44.6    -65.76
2   2556.622213     Foo     49.51   -120.69     DEF     49.52   -113.99
3    301.193418     Foo     49.51   -120.69     YXZ     47.96   -84.78
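
A small sketch of what the join/split round trip does to the types, and one way to recover the numbers afterwards. It assumes none of the string elements contains a comma (otherwise the split would produce extra pieces); the two-row frame is shortened for illustration.

```python
import pandas as pd

df = pd.DataFrame({'b': [['YLK', 44.48, -79.55], ['HG76', 44.60, -65.76]]})

# Join each list into one comma-separated string, then split it again.
parts = df['b'].apply(lambda x: ','.join(map(str, x))).str.split(',', expand=True)
parts.columns = ['b1', 'b2', 'b3']

# Every value is now a Python string, including the former floats.
raw_type = type(parts.loc[0, 'b2'])
print(raw_type)

# Convert the numeric columns back explicitly.
for col in ['b2', 'b3']:
    parts[col] = pd.to_numeric(parts[col])

print(parts.dtypes)
```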

CodePudding user response:

A more direct way is to use the .to_list() method of a pandas.Series and assign the resulting list of lists to new columns, like this:

df[["a1", "a2", "a3"]] = df["a"].to_list()
df[["b1", "b2", "b3"]] = df["b"].to_list()

In one line:

df[["a1", "a2", "a3", "b1", "b2", "b3"]] = (df["a"] + df["b"]).to_list()

If you don't need a and b afterwards, you can drop them:

df.drop(columns=["a", "b"], inplace=True)
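
To get the exact a..g layout asked for in the question, one possible sketch is to build a fresh frame from the concatenated lists rather than assigning into df, since the names a, b and c are already taken there. The variable name out is just an illustrative choice, and the frame is shortened to two rows.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [['Foo', 49.51, -120.69], ['Foo', 49.51, -120.69]],
    'b': [['YLK', 44.48, -79.55], ['HG76', 44.60, -65.76]],
    'c': [1628.931942, 1949.748061],
})

# Adding two Series of lists concatenates the lists row by row,
# giving one six-element list per row.
combined = (df['a'] + df['b']).to_list()

# Spread the six elements over columns a..f, then append the old c as g.
out = pd.DataFrame(combined, columns=list('abcdef'), index=df.index)
out['g'] = df['c']
print(out)
```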