Splitting tuples of different lengths to columns in Pandas DF

Time:04-17

I have a dataframe that looks like this:

id  human_id
1   ('apples', '2022-12-04', 'a5ted')
2   ('bananas', '2012-2-14')
3   ('2012-2-14', 'reda21', 'ss')
..  ..

I would like a "pythonic" way to get output like this:

id  human_id                           col1       col2        col3
1   ('apples', '2022-12-04', 'a5ted')  apples     2022-12-04  a5ted
2   ('bananas', '2012-2-14')           bananas    2012-2-14   np.NaN
3   ('2012-2-14', 'reda21', 'ss')      2012-2-14  reda21      ss

import pandas as pd

df['a'], df['b'], df['c'] = df.human_id.str

The code I have tried gives me an error:

ValueError: not enough values to unpack (expected 2, got 1)

How can I split the values in each tuple into separate columns?

Thank you.

CodePudding user response:

You can build a new DataFrame from the tuples and join it to the original:

out = df.join(pd.DataFrame(df.human_id.tolist(), index=df.index, columns=['a', 'b', 'c']))
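For reference, here is a minimal runnable sketch of that one-liner on the sample data from the question. The intermediate name `parts` is only for illustration, and the sketch assumes the longest tuple has exactly three elements, so that the three column names match the width of the new frame.

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    "id": [1, 2, 3],
    "human_id": [
        ("apples", "2022-12-04", "a5ted"),
        ("bananas", "2012-2-14"),
        ("2012-2-14", "reda21", "ss"),
    ],
})

# Turn the tuples into rows of a new frame; shorter tuples are padded
# with missing values. Reusing df.index keeps the rows aligned for the join.
parts = pd.DataFrame(df["human_id"].tolist(), index=df.index, columns=["a", "b", "c"])

# Attach the new columns next to the original ones
out = df.join(parts)
print(out)
```

Passing `index=df.index` matters when `df` does not have a default 0..n-1 index, because `join` aligns rows on the index.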

CodePudding user response:

You can do it this way. pandas puts None wherever a tuple is too short to supply a value. You can then join df1 back onto df.

import pandas as pd

d = {'id': [1, 2, 3],
     'human_id': [('apples', '2022-12-04', 'a5ted'), ('bananas', '2012-2-14'), ('2012-2-14', 'reda21', 'ss')]}

df = pd.DataFrame(data=d)

# Each tuple becomes one row; shorter tuples are padded with None
df1 = pd.DataFrame(list(df['human_id']), columns=['col1', 'col2', 'col3'])
print(df1)

Output


        col1        col2   col3
0     apples  2022-12-04  a5ted
1    bananas   2012-2-14   None
2  2012-2-14      reda21     ss
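
To attach the split columns to the original frame, one option is `pd.concat` along the column axis. The sketch below also derives the column names from the width of the split frame, so it does not need to know the longest tuple length in advance. The `col1`, `col2`, ... naming simply follows the question, and `result` is an illustrative name.

```python
import pandas as pd

d = {
    "id": [1, 2, 3],
    "human_id": [
        ("apples", "2022-12-04", "a5ted"),
        ("bananas", "2012-2-14"),
        ("2012-2-14", "reda21", "ss"),
    ],
}
df = pd.DataFrame(data=d)

# Split the tuples without naming columns yet; the frame is as wide
# as the longest tuple
df1 = pd.DataFrame(df["human_id"].tolist(), index=df.index)

# Name the columns col1, col2, ... based on how many were produced
df1.columns = [f"col{i + 1}" for i in range(df1.shape[1])]

# Place the new columns to the right of the original ones
result = pd.concat([df, df1], axis=1)
print(result)
```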
