I have the string "/browse/advanced-computer-science-modules?title=machine-learning"** in Python. I want to print the string in between the second "/"
and the "?"
, which is "advanced-computer-science-modules"
.
I've created a regular expression that is as follows ^([a-z]*[\-]*[a-z])*?$
but it prints nothing when I run the .findall()
function from the re module.
I created my own regex and imported the re
module in python. Below is a snippet of my code that returned nothing.
regex = re.compile(r'^([a-z]*[\-]*[a-z])*?$')
str = '/browse/advanced-computer-science-modules?title=machine-learning'
print(regex.findall(str))
CodePudding user response:
Since this appears to be a URL, I'd suggest you use URL-parsing tools instead:
>>> from urllib.parse import urlsplit
>>> url = '/browse/advanced-computer-science-modules?title=machine-learning'
>>> s = urlsplit(url)
SplitResult(scheme='', netloc='', path='/browse/advanced-computer-science-modules', query='title=machine-learning', fragment='')
>>> s.path
'/browse/advanced-computer-science-modules'
>>> s.path.split('/')[-1]
'advanced-computer-science-modules'
CodePudding user response:
The regex is as follows:
\/[a-zA-Z\-] \?
Then you catch the substring:
regex.findall(str)[1:len(str) - 1]
Very specific to this problem, but it should work.
CodePudding user response:
Alternatively, you can use split
method of a string
:
str = '/browse/advanced-computer-science-modules?title=machine-learning'
result = str.split('/')[-1].split('?')[0]
print(result)
#advanced-computer-science-modules