Extract multiple patterns from a string in pandas

Time:10-04

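I want to create a new column from the ID column that holds the second underscore-separated piece of each value: the last piece of a two-part ID such as hello_dd, or the middle piece of a three-part ID such as hello_aa_now.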

Data

Status             ID
Ok                 hello_dd           
Ok                 hello_aa_now       
No                 standard_cc        
no                 standard_ee_not  

Desired

Status             ID                        type
Ok                 hello_dd                  dd     
Ok                 hello_aa_now              aa
No                 standard_cc               cc
no                 standard_ee_not           ee

Doing

I am able to extract the last string, but I am still working out how to extract the middle string.

df['type'] = df['ID'].str.split('_').str[-1]

Any suggestion is appreciated.

CodePudding user response:

Assuming you want to extract the string after the first _:

df['type'] = df['ID'].str.extract(r'_([^_]+)')
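
To see what that pattern matches, here is a minimal sketch using Python's standard re module on the IDs from the question. The names pattern, ids and middle are only illustrative. It relies on str.extract keeping the first match in each row, which is the same thing re.search returns.

```python
import re

# An underscore, then one or more non-underscore characters
# captured as group 1.
pattern = re.compile(r'_([^_]+)')

# IDs copied from the question.
ids = ['hello_dd', 'hello_aa_now', 'standard_cc', 'standard_ee_not']

# re.search stops at the first match, so only the piece right after
# the first underscore is captured.
middle = [pattern.search(s).group(1) for s in ids]
print(middle)  # ['dd', 'aa', 'cc', 'ee']
```

Because [^_]+ stops at the next underscore, anything after a second underscore (now, not) is left out.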

With split:

df['type'] = df['ID'].str.split('_').str[1]

output:

  Status               ID type
0     Ok         hello_dd   dd
1     Ok     hello_aa_now   aa
2     No      standard_cc   cc
3     no  standard_ee_not   ee
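
To reproduce this end to end, here is a self-contained sketch that rebuilds the sample frame from the question and runs both approaches. The variable names by_regex and by_split are only illustrative, and expand=False is an optional addition that makes str.extract return a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Status': ['Ok', 'Ok', 'No', 'no'],
    'ID': ['hello_dd', 'hello_aa_now', 'standard_cc', 'standard_ee_not'],
})

# Regex approach: capture the piece after the first underscore.
# expand=False returns a Series for a single capture group.
by_regex = df['ID'].str.extract(r'_([^_]+)', expand=False)

# Split approach: split on '_' and take the second piece (index 1).
by_split = df['ID'].str.split('_').str[1]

df['type'] = by_regex
print(df)
```

Both approaches give the same column for this data. For an ID with no underscore at all, both should give NaN rather than raise an error.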