Home > Enterprise >  How to split column substrings into specific columns
How to split column substrings into specific columns

Time:01-27

I have a dataframe as below:

 -------- 
|     Key|
 -------- 
|  x10x60|
|x1x19x33|
|   x12x6|
|   a14x4|
|x1x1x1x6|
|x2a23x30|
 --------|

And I want the output like this: The Key column can be divide by each x element and put into xa/xb/xc/xd by order, but if there are a elements then place it into ta/tb/tc/td by order.

 -------- ----- ------ ----- ----- ----- ---- ---- ----- 
|     Key|   xa|    xb|   xc|   xd|   ta|  tb|  tc|   td|
 -------- ----- ------ ----- ----- ----- ---- ---- ----- 
|  x10x60|  x10|   x60|     |     |     |    |    |     |
|x1x19x33|   x1|   x19|  x33|     |     |    |    |     |
|   x12x6|  x12|    x6|     |     |     |    |    |     |
|   a14x4|     |    x4|     |     |  a14|    |    |     |
|x1x1x1x6|   x1|    x1|   x1|   x6|     |    |    |     |
|x2a23x30|   x2|      |  x30|     |     | a23|    |     |
 --------|----- ------ ----- ----- ----- ---- ---- ----- 

I tried substr() or substring() cannot have the output, but seems stop at the divide.

CodePudding user response:

Normally you can just use Series.str.split with expand=True, and pandas will auto-expand the substrings into columns.

But since you want to place the substrings into very specific columns, use Series.str.extractall:

import string

m = df['Key'].str.extractall(r'([xa][0-9] )').reset_index()            # match the x* and a* substrings
m['match'] = m['match'].map(dict(enumerate(string.ascii_lowercase)))   # map 0,1,2... -> a,b,c...
m['match'] = np.where(m[0].str.startswith('x'), 'x', 't')   m['match'] # add x or t prefix

out = df[['Key']].join(
    m.pivot(index='level_0', columns='match', values=0)                # reshape into wide form
     .sort_index(key=lambda c: c.str.startswith('t'), axis=1)          # put t columns at end
     .fillna('')
)

Output:

>>> print(out)

        Key   xa   xb   xc  xd   ta   tb
0    x10x60  x10  x60                   
1  x1x19x33   x1  x19  x33              
2     x12x6  x12   x6                   
3     a14x4        x4           a14     
4  x1x1x1x6   x1   x1   x1  x6          
5  x2a23x30   x2       x30           a23
  • Related