Pandas extract substring from column of string using regex

Time:09-29

I have a dataframe with the string column as:

df['C23']

Col1 Col2
11   /*[lion]*/
21   /*[tiger]*/

I need the following:

Col1 Col2
11   lion
21   tiger

I tried the following code:

df['C23'].str.extract(r"/*(.*?)*/")

but it produces empty strings.

CodePudding user response:

You can use

df['result'] = df['C23'].str.extract(r"/\*\[(.*?)]\*/")

The /\*\[(.*?)]\*/ regex matches

  • /\*\[ - the literal string /*[ (the * and [ are escaped because they are regex metacharacters)
  • (.*?) - Group 1: zero or more characters other than line breaks, as few as possible
  • ]\*/ - the literal string ]*/
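
As a quick check, here is a minimal runnable sketch of that pattern. The sample frame is rebuilt from the question's data and uses the column name C23 from the asker's code. Passing expand=False is my addition, so that a Series is returned instead of a one-column DataFrame.

```python
import pandas as pd

# Sample data rebuilt from the question; 'C23' is the column name used in the asker's code.
df = pd.DataFrame({"Col1": [11, 21], "C23": ["/*[lion]*/", "/*[tiger]*/"]})

# Escaping * and [ makes them literal characters; the single capture group
# is what str.extract returns. expand=False yields a Series.
df["result"] = df["C23"].str.extract(r"/\*\[(.*?)]\*/", expand=False)

print(df["result"].tolist())  # ['lion', 'tiger']
```

Rows that do not match the pattern come back as NaN rather than as an empty string.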

CodePudding user response:

Assuming you want to convert /*[lion]*/ to lion and all elements follow the same pattern, you do not need a regex, just slice:

df['Col2'] = df['Col2'].str[3:-3]
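
A small sketch of the slicing approach, assuming every value really has the three-character prefix /*[ and the three-character suffix ]*/. The second Series, with a value that breaks the pattern, is a made-up example to show how slicing and str.extract differ in that case.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({"Col1": [11, 21], "Col2": ["/*[lion]*/", "/*[tiger]*/"]})

# Drop the first three and last three characters of every string.
df["Col2"] = df["Col2"].str[3:-3]
print(df["Col2"].tolist())  # ['lion', 'tiger']

# Hypothetical data with one value that does not follow the pattern.
mixed = pd.Series(["/*[lion]*/", "plain"])
sliced = mixed.str[3:-3]  # slices blindly, whatever the content
extracted = mixed.str.extract(r"/\*\[(.*?)]\*/", expand=False)  # NaN where there is no match

print(sliced.tolist())            # ['lion', '']
print(extracted.isna().tolist())  # [False, True]
```

Slicing is simpler when the format is guaranteed. The regex is the safer choice if some rows may not follow it, since non-matching rows show up as NaN instead of being silently truncated.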