Home > Blockchain >  Pandas Implode and Create 2 New Column based on list value
Pandas Implode and Create 2 New Column based on list value

Time:06-07

I have pandas data frame like this

Name   Col1
Foo    [A, C, B, D, E]
Bar    [G, M, O]
Baz    [E, B]

I want to change it to:

Name   New_Col1   New_Col2   
Foo    A          C
Foo    C          B
Foo    B          D
Foo    D          E
Bar    G          M
Bar    M          O
Baz    E          B

How can I do that?

CodePudding user response:

Do two explode operations and then concat the results:

s = df.set_index('Name').Col1.str
pd.concat([s[:-1].explode(), 
           s[1:].explode()],
           axis=1)\
   .reset_index()

  Name Col1 Col1
0  Foo    A    C
1  Foo    C    B
2  Foo    B    D
3  Foo    D    E
4  Bar    G    M
5  Bar    M    O
6  Baz    E    B

CodePudding user response:

Using explode and GroupBy.shift:

s = df['Col1'].explode()
df.drop(columns='Col1').join(
 pd.concat([s, s.groupby(level=0).shift(-1)], axis=1)
   .set_axis(['New_Col1', 'New_Col2'], axis=1)
   .dropna()
)

output:

  Name New_Col1 New_Col2
0  Foo        A        C
0  Foo        C        B
0  Foo        B        D
0  Foo        D        E
1  Bar        G        M
1  Bar        M        O
2  Baz        E        B

With the approach, you can easily generalize to more combinations, example with 3:

s = df['Col1'].explode()

n = 3

df.drop(columns='Col1').join(
 pd.concat([s.groupby(level=0).shift(-x) for x in range(n)], axis=1)
   .set_axis([f'New_Col{i 1}' for i in range(n)], axis=1)
   .dropna()
)

output:

  Name New_Col1 New_Col2 New_Col3
0  Foo        A        C        B
0  Foo        C        B        D
0  Foo        B        D        E
1  Bar        G        M        O
2  Baz      NaN      NaN      NaN
  • Related