Splitting a URL whose parameters keep changing position, in Python

I need to split a URL in which the positions of its values change very often.

For example, this is the URL with the request token in three different positions:

01:-https://127.0.0.1/?action=login&type=login&status=success&request_token=oCS44HJQT2ZSCGb39H76CjgXb0s2klwA

02:-https://127.0.0.1/?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&action=login&type=login&status=success

03:-https://127.0.0.1/?&action=login&request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&type=login&status=success

From these URLs I need only the value of the request token, i.e. the alphanumeric string that comes after 'request_token=', like this: '43CbEWSxdqztXNRpb2zmypCr081eF92d'.

To split the URL I'm using this code:

request_token = driver.current_url.split('=')[1].split('&action')[0]

But it fails when the request token is not in the position the code expects.

Can anyone please give me a solution, ideally a single line of Python, that extracts the token wherever it appears in the URL? I'd be very grateful for any help from fellow members.

Note: I'm using driver.current_url because I'm working with Selenium.

CodePudding user response：

You can use the urllib.parse module to parse URLs properly.

>>> from urllib.parse import urlparse, parse_qs
>>> url = "?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&action=login&type=login&status=success"
>>> query = parse_qs(urlparse(url).query)
>>> query['request_token']
['43CbEWSxdqztXNRpb2zmypCr081eF92d']
>>> query['request_token'][0]
'43CbEWSxdqztXNRpb2zmypCr081eF92d'

This handles the actual structure of the URLs and doesn't depend on the position of the parameter or other special cases you'd have to handle in a regex.
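If it helps, here is the same idea wrapped in a small helper and run against the three URLs from the question. This is only a sketch: the function name get_request_token and its name argument are made up for illustration, and it returns None when the parameter is missing instead of raising KeyError.

```python
from urllib.parse import urlparse, parse_qs


def get_request_token(url, name="request_token"):
    """Return the first value of query parameter `name`, or None if absent.

    Hypothetical helper for illustration; not part of urllib or Selenium.
    """
    params = parse_qs(urlparse(url).query)  # maps each name to a list of values
    values = params.get(name)
    return values[0] if values else None


# The three URLs from the question, with the token in different positions.
urls = [
    "https://127.0.0.1/?action=login&type=login&status=success&request_token=oCS44HJQT2ZSCGb39H76CjgXb0s2klwA",
    "https://127.0.0.1/?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&action=login&type=login&status=success",
    "https://127.0.0.1/?&action=login&request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&type=login&status=success",
]
tokens = [get_request_token(u) for u in urls]
for token in tokens:
    print(token)
```

With Selenium, the single-line form asked for in the question would presumably be request_token = parse_qs(urlparse(driver.current_url).query)['request_token'][0], with the caveat that it raises KeyError if the parameter is not in the URL.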

CodePudding user response：

Assuming you have the URLs as strings then you could use a regular expression to isolate the request tokens.

import re
urls = ['https://127.0.0.1/?action=login&type=login&status=success&request_token=oCS44HJQT2ZSCGb39H76CjgXb0s2klwA',
        'https://127.0.0.1/?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&action=login&type=login&status=success',
        'https://127.0.0.1/?&action=login&request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&type=login&status=success']
for url in urls:
    m = re.match('.*request_token=(.*?)(?:&|$)', url)
    if m:
        print(m.group(1))
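One caveat with the pattern above: because it starts with '.*', it would also match a parameter whose name merely ends in request_token. A slightly stricter variant, sketched below, requires '?' or '&' directly before the name and stops the value at '&' or '#'. The names TOKEN_RE and find_request_token, and the old_request_token parameter in the sample URL, are hypothetical and only there to show the difference.

```python
import re

# Match request_token only where a parameter starts (right after '?' or '&'),
# and capture everything up to the next '&' or '#'.
TOKEN_RE = re.compile(r"[?&]request_token=([^&#]*)")


def find_request_token(url):
    """Return the raw request_token value from url, or None when it is missing."""
    m = TOKEN_RE.search(url)
    return m.group(1) if m else None


# Hypothetical URL with a second, similarly named parameter after the real one.
sample = ("https://127.0.0.1/?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d"
          "&old_request_token=abc")
found = find_request_token(sample)
missing = find_request_token("https://127.0.0.1/?action=login")
print(found)
print(missing)
```

Note that a regex returns the value exactly as it appears in the URL, without percent-decoding, so for anything beyond plain alphanumeric tokens the urllib.parse approach is probably the safer choice.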