Split String in Python with Empty Values-CodePudding

I have a column of strings that include two unique columns - a column of two letters, and then a column of Y or E. In both cases they're not always filled. I'm hoping to split them into two new columns:

import pandas as pd
d={'col1':['AD','Y','E','AD E','WZ E'],
   'col1_new':['AD','','','AD','WZ'],
   'col2_new':['','Y','E','E','E']}
pd.DataFrame(d)

Any tips?

CodePudding user response：

You can use a regex to extract the two parts:

df[['col1_new', 'col2_new']] = (df['col1'].str.extract('(\w{2})?\s*(\w)?')
                                          .fillna('')
                                )

output:

   col1 col1_new col2_new
0    AD       AD         
1     Y                 Y
2     E                 E
3  AD E       AD        E
4  WZ E       WZ        E

regex

(\w{2})?     # capture 2 ({2}) "word" characters (\w), optionally (?)
\s*          # match zero or more (*) spaces (\s)
(\w)?        # capture 1 "word" character, optionally

regex demo