I was using the standard split operation in Python to extract IDs from URLs. It works for
URLs of the form https://music.com/146
where I need to extract 146, but it fails in cases like
https://music.com/144?i=150
where I need to extract the 150
that follows i=.
I currently use
url.split("/")[-1]
Is there a better way to do it?
CodePudding user response:
Python provides a few tools to make this process easier.
As @Barmar mentioned, you can use urlsplit
to split the URL, which gets you a named tuple:
>>> from urllib import parse as urlparse
>>> x = urlparse.urlsplit('https://music.com/144?i=150')
>>> x
SplitResult(scheme='https', netloc='music.com', path='/144', query='i=150', fragment='')
You can use the parse_qs
function to convert the query string into a dictionary. Each value is a list, because a key can appear more than once in a query string:
>>> urlparse.parse_qs(x.query)
{'i': ['150']}
Or in a single line:
>>> urlparse.parse_qs(urlparse.urlsplit('https://music.com/144?i=150').query)['i']
['150']
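To cover both URL forms from the question, the two calls above can be wrapped in a small helper. This is only a sketch: the name extract_id and its param argument are made up for illustration, and it assumes the ID is either the first value of the i parameter or, when that parameter is absent, the last path segment.

```python
from urllib.parse import urlsplit, parse_qs

def extract_id(url, param="i"):
    """Return the ID as a string (hypothetical helper, not a library function)."""
    parts = urlsplit(url)
    # parse_qs maps each key to a list of values; .get returns None if the key is absent.
    values = parse_qs(parts.query).get(param)
    if values:
        return values[0]
    # No such query parameter: fall back to the last path segment,
    # ignoring a trailing slash if there is one.
    return parts.path.rstrip("/").rsplit("/", 1)[-1]

print(extract_id("https://music.com/146"))
print(extract_id("https://music.com/144?i=150"))
```

Because the query string is parsed rather than split on a fixed substring, this should keep working if other parameters appear before or after i.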
CodePudding user response:
As @Barmar mentioned, you can fix your code to:
url.split("/")[-1].split("?i=")[-1]
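Basically, splitting https://music.com/144?i=150 on "/" gives the pieces https:, an empty string, music.com and 144?i=150, and [-1] takes the last one, 144?i=150.
Splitting that on "?i=" gives 144 and 150, and [-1] again takes the last one, 150.
For a URL without ?i=, such as https://music.com/146, the second split returns the single piece 146, so the same expression still works.
If you need it to be a number, you can use int(url.split("/")[-1].split("?i=")[-1])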
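As a quick check, the split chain can be put in a function and run against both URL forms from the question. The function name id_from_split is hypothetical, and the sketch assumes that i, when present, is written exactly as "?i=" and is the last thing in the URL.

```python
def id_from_split(url):
    """Extract the numeric ID using only str.split (illustrative sketch)."""
    # Last piece after the final "/", e.g. "146" or "144?i=150".
    last_segment = url.split("/")[-1]
    # If "?i=" is absent, split returns a one-element list, so [-1] still works.
    return int(last_segment.split("?i=")[-1])

print(id_from_split("https://music.com/146"))
print(id_from_split("https://music.com/144?i=150"))
```

This stays close to the original one-liner, but int() will raise ValueError if extra query parameters follow the ID, so it only suits URLs of exactly these two shapes.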
CodePudding user response:
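You can use a regular expression:
import re
url = 'https://music.com/144?i=150'
# One or more digits immediately before the "?"
match = re.search(r'(\d+)\?', url)
if match:
    value = match[1]  # '144'
If you need the 150:
# One or more digits immediately after "i="
match = re.search(r'i=(\d+)', url)
if match:
    value = match[1]  # '150'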
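The two searches can also be combined so that one function handles both URL forms. The following is a sketch under assumptions, not part of the answer above: the name id_from_regex is made up, the first pattern requires i= to follow "?" or "&" so that a parameter merely ending in i is not matched, and the fallback pattern assumes the path ends in a purely numeric segment.

```python
import re

def id_from_regex(url):
    """Return the ID as an int, or None if no ID is found (illustrative sketch)."""
    # Prefer the i query parameter when it is present.
    match = re.search(r'[?&]i=(\d+)', url)
    if match:
        return int(match.group(1))
    # Otherwise take a numeric path segment that ends the path,
    # i.e. is followed by an optional "/" and then "?", "#", or end of string.
    match = re.search(r'/(\d+)/?(?:[?#]|$)', url)
    return int(match.group(1)) if match else None

print(id_from_regex("https://music.com/146"))
print(id_from_regex("https://music.com/144?i=150"))
```

Returning None instead of raising lets the caller decide how to treat URLs that contain no ID.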