Create a new column holding the middle or last substring extracted from an existing column in a dataset.
Data
Status ID
Ok hello_dd
Ok hello_aa_now
No standard_cc
no standard_ee_not
Desired
Status ID type
Ok hello_dd dd
Ok hello_aa_now aa
No standard_cc cc
no standard_ee_not ee
Doing
I am able to extract the last string; however, I am still researching how to extract the middle string.
df['type'] = df['ID'].str.split('_').str[-1]
Any suggestion is appreciated.
CodePudding user response:
Assuming you want to extract the string after the first _:
df['type'] = df['ID'].str.extract(r'_([^_]+)')
With split:
df['type'] = df['ID'].str.split('_').str[1]
Output:
Status ID type
0 Ok hello_dd dd
1 Ok hello_aa_now aa
2 No standard_cc cc
3 no standard_ee_not ee
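For anyone who wants to try both approaches end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample data in the question; the variable names by_regex and by_split are only illustrative, and expand=False is an optional choice made here so that str.extract returns a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Status": ["Ok", "Ok", "No", "no"],
    "ID": ["hello_dd", "hello_aa_now", "standard_cc", "standard_ee_not"],
})

# Regex approach: capture the run of non-underscore characters
# that follows the first "_". expand=False yields a Series.
by_regex = df["ID"].str.extract(r"_([^_]+)", expand=False)

# Split approach: split on "_" and take the second piece (index 1).
by_split = df["ID"].str.split("_").str[1]

# Both approaches give the same values for this data; keep one of them.
df["type"] = by_split
print(df)
```

With either approach, an ID that contains no underscore should end up as NaN in the new column rather than raising an error, so it may be worth checking for missing values afterwards if the real data is less regular than the sample.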