Python path regex optional match

Time:09-23

I have path strings like these:

tree/bee.horse_2021/moose/loo.se
bee.horse_2021/moose/loo.se
bee.horse_2021/mo.ose/loo.se

The path can be arbitrarily long after moose. Sometimes the first part of the path such as tree/ is missing, sometimes not. I want to capture tree in the first group if it exists and bee.horse in the second.

I came up with this regex, but it doesn't work:

path_regex = r'^(?:(.*)/)?([a-zA-Z]+\.[a-zA-Z]+).+$'

What am I missing here?

CodePudding user response:

You can restrict the characters to be matched in the first capture group.

For example, you could match any character except / or . using a negated character class [^/\n.]

^(?:([^/\n.]+)/)?([a-zA-Z]+\.[a-zA-Z]+).*$


Or you can restrict the characters to match word characters \w only

^(?:(\w+)/)?([a-zA-Z]+\.[a-zA-Z]+).*$


Note that in your pattern, the .+ at the end matches at least one character. If you want to make that part optional, you can change it to .*
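To check the two suggestions against the sample paths from the question, here is a minimal sketch. It assumes the quantifiers in the patterns are + (as the note about "at least one character" implies); the names ORIGINAL, NEGATED, WORD, PATHS and capture are made up for this illustration.

```python
import re

# Assumption: each repeated token uses the + quantifier.
ORIGINAL = re.compile(r'^(?:(.*)/)?([a-zA-Z]+\.[a-zA-Z]+).+$')
NEGATED = re.compile(r'^(?:([^/\n.]+)/)?([a-zA-Z]+\.[a-zA-Z]+).*$')
WORD = re.compile(r'^(?:(\w+)/)?([a-zA-Z]+\.[a-zA-Z]+).*$')

# Sample paths taken from the question.
PATHS = [
    'tree/bee.horse_2021/moose/loo.se',
    'bee.horse_2021/moose/loo.se',
    'bee.horse_2021/mo.ose/loo.se',
]


def capture(pattern, path):
    """Return (group 1, group 2), or None when the path does not match."""
    m = pattern.match(path)
    return m.groups() if m else None


for path in PATHS:
    # Group 1 is None when the leading directory is absent.
    print(path, capture(NEGATED, path), capture(WORD, path))

# The original pattern's greedy (.*) runs to the last slash instead:
print(capture(ORIGINAL, PATHS[0]))
```

With the restricted first group, the optional prefix can no longer swallow slashes or dots, so it either matches a single leading directory such as tree or is skipped entirely.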

CodePudding user response:

You are missing the escape character on the / in the regex. It should be:

path_regex = r'^(?:(.*)\/)?([a-zA-Z]+\.[a-zA-Z]+).+$'

This should work. I tested it here and it works: https://regex101.com/r/ea9xZE/1/
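One thing worth checking when moving a pattern from an online tester to Python: in Python's re module, / is not a metacharacter, so \/ and / match the same literal slash. The escape mostly matters in tools that use / as the pattern delimiter. A small sketch, assuming the quantifiers are + and using a sample path from the question (the variable names are illustrative):

```python
import re

path = 'tree/bee.horse_2021/moose/loo.se'

# Same pattern, with and without the backslash before the slash.
unescaped = re.match(r'^(?:(.*)/)?([a-zA-Z]+\.[a-zA-Z]+).+$', path)
escaped = re.match(r'^(?:(.*)\/)?([a-zA-Z]+\.[a-zA-Z]+).+$', path)

# Both variants match and capture identical groups.
print(unescaped.groups() == escaped.groups())
print(escaped.groups())
```

Comparing the captured groups, rather than only checking that the pattern matches, is a quick way to see whether a change to the pattern actually affects the result.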
