Extracting string using regex in python

Time:10-12

I have a list of file paths and want to extract the string that appears between "hone/" and "-".

For example, if the string is 'abfss://[email protected]/alicona/hone/ 120009163_6722508_.csv', then I would like to extract '120009163'.

Since I have a list of such strings, I would like to do this in one line, or with something that applies to every string in the list.

I am trying to do this in PySpark.

CodePudding user response:

(?<=hone\/)(.*?)(?=_)

I used _ instead of - as the closing delimiter, because the sample path separates the numbers with underscores. That gives the result you want.
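
As a rough illustration of how this lookaround pattern could be applied to a whole list in plain Python, here is a minimal sketch using the standard `re` module. The sample paths and the helper name `extract_id` are made up for the example, and the `.strip()` call is an assumption added to cope with a stray space after "hone/" like the one in the question's sample.

```python
import re

# Pattern from the answer above: lazily capture the text between
# "hone/" and the first "_" that follows it.
PATTERN = re.compile(r"(?<=hone/)(.*?)(?=_)")

# Hypothetical, shortened sample paths (not the asker's real data).
paths = [
    "alicona/hone/120009163_6722508_.csv",
    "alicona/hone/ 120009164_6722509_.csv",  # stray space after "hone/"
    "alicona/other/no_match.csv",            # no "hone/" segment
]


def extract_id(path):
    """Return the text between "hone/" and the first "_", or None if absent."""
    match = PATTERN.search(path)
    # strip() is an assumption: it drops whitespace captured after "hone/".
    return match.group(1).strip() if match else None


# One line over the whole list, as asked in the question.
ids = [extract_id(p) for p in paths]
print(ids)
```

Paths without a "hone/" segment come back as None here, so they can be filtered out or handled separately.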

CodePudding user response:

You could use the regex pattern /(\d+)\w*\.\w+$:

from pyspark.sql.functions import regexp_extract
df.select(regexp_extract('path', r'/(\d+)\w*\.\w+$', 1))
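
Before running this on a DataFrame, the pattern can be sanity-checked locally. The sketch below uses Python's `re` module with the pattern written as `/(\d+)\w*\.\w+$`. Spark's `regexp_extract` uses Java regular expressions, but this pattern relies only on constructs that should behave the same in both. The sample paths and the helper name `leading_digits` are hypothetical. Returning an empty string on a miss is meant to mirror how `regexp_extract` reports no match.

```python
import re

# The answer's pattern: a "/" followed by the digits that start the
# file name, then the rest of the name and its extension at end of string.
PATTERN = re.compile(r"/(\d+)\w*\.\w+$")


def leading_digits(path):
    """Return the digits that start the file name, or "" when nothing matches."""
    match = PATTERN.search(path)
    return match.group(1) if match else ""


# Hypothetical, shortened paths for illustration only.
with_id = leading_digits("alicona/hone/120009163_6722508_.csv")
without_id = leading_digits("alicona/hone/readme.txt")
print(repr(with_id), repr(without_id))
```

Note that this pattern expects the digits to follow the last "/" directly, so a path with a space after "hone/" would not match unless the pattern also allows for whitespace there.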