Python regex: get words that are between given word and the closest character

Time:03-29

I have a dataset that looks like this:

ID Details
1 he wants to invest, Project: Emaar, budget []
2 she is interested in renting, Project: W Residence, bedrooms=2
3 wants to sell, Project: Dubai View; callback

I need to extract the project name, which is located between the word 'Project:' and the closest delimiter character (e.g. , or ;).

So that in the result it looks like this:

ID Details
1 Emaar
2 W Residence
3 Dubai View

CodePudding user response:

If a comma or semicolon always follows the project name, and your project names contain only letters and spaces, then you could use this regex:

Project: ([A-Za-z ]+)[;,]

Example.
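A minimal sketch of how that capture-group pattern could be applied with Python's standard `re` module. The rows are copied from the question; the names `rows`, `PROJECT_RE`, and `extract_project` are made up for illustration.

```python
import re

# Sample rows copied from the question (ID -> Details).
rows = {
    1: "he wants to invest, Project: Emaar, budget []",
    2: "she is interested in renting, Project: W Residence, bedrooms=2",
    3: "wants to sell, Project: Dubai View; callback",
}

# Capture letters and spaces after "Project: ", up to the first comma or semicolon.
PROJECT_RE = re.compile(r"Project: ([A-Za-z ]+)[;,]")


def extract_project(details):
    """Return the captured project name, or None when the pattern does not match."""
    match = PROJECT_RE.search(details)
    return match.group(1) if match else None


projects = {row_id: extract_project(text) for row_id, text in rows.items()}
print(projects)
```

If the data lives in a pandas DataFrame (an assumption, since the question does not say), the same pattern could likely be passed to `Series.str.extract` on the Details column instead of looping.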

CodePudding user response:

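If the pattern is Project: (something) followed by a comma or semicolon, you can use the following regex. The lazy quantifier makes the match stop at the closest delimiter rather than the last one: (?<=Project:\s).*?(?=[,;])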
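A short sketch of the lookbehind/lookahead approach using Python's `re` module. It compares a greedy `.*` with a lazy `.*?` on a hypothetical row (not from the question) that has more than one delimiter after the project name. That is the case where the two differ, and the lazy form is the one that stops at the closest delimiter.

```python
import re

# Lookbehind for "Project:" plus one whitespace character,
# lookahead for a comma or semicolon.
LAZY_RE = re.compile(r"(?<=Project:\s).*?(?=[,;])")    # stops at the first delimiter
GREEDY_RE = re.compile(r"(?<=Project:\s).*(?=[,;])")   # runs on to the last delimiter

# Hypothetical row with two delimiters after the project name.
text = "wants to buy, Project: Marina Gate, budget 2M; callback"

lazy = LAZY_RE.search(text).group(0)
greedy = GREEDY_RE.search(text).group(0)
print(lazy)    # Marina Gate
print(greedy)  # Marina Gate, budget 2M

# Rows from the question: the lazy pattern returns just the project name.
question_rows = [
    "he wants to invest, Project: Emaar, budget []",
    "she is interested in renting, Project: W Residence, bedrooms=2",
    "wants to sell, Project: Dubai View; callback",
]
names = [LAZY_RE.search(row).group(0) for row in question_rows]
print(names)
```

Unlike the character-class version, this pattern also accepts digits or punctuation in the project name, since `.` matches any character except a newline.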
