Splitting a URL whose values keep changing position, in Python

Time:10-22

I need to split a URL in which the positions of the values change very often.

For example, here is the URL with the request token in three different positions:

01:-https://127.0.0.1/?action=login&type=login&status=success&request_token=oCS44HJQT2ZSCGb39H76CjgXb0s2klwA

02:-https://127.0.0.1/?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&action=login&type=login&status=success

03:-https://127.0.0.1/?&action=login&request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&type=login&status=success

From these URLs I need only the value of the request token, the alphanumeric string that comes after the '=', like '43CbEWSxdqztXNRpb2zmypCr081eF92d'.

To split the URL I'm using this code:

request_token = driver.current_url.split('=')[1].split('&action')[0]

But it gives me an error when the request token is not in the expected position.

So can anyone please give me a single-line solution for this URL splitting in Python? It would be a great help.

Note: I'm using driver.current_url because I'm working with Selenium.

CodePudding user response:

You can use the urllib.parse module to parse URLs properly.

>>> from urllib.parse import urlparse, parse_qs
>>> url = "?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&action=login&type=login&status=success"
>>> query = parse_qs(urlparse(url).query)
>>> query['request_token']
['43CbEWSxdqztXNRpb2zmypCr081eF92d']
>>> query['request_token'][0]
'43CbEWSxdqztXNRpb2zmypCr081eF92d'

This handles the actual structure of the URLs and doesn't depend on the position of the parameter or other special cases you'd have to handle in a regex.
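To show that the parameter's position does not matter, here is a minimal sketch that runs the same parsing over the three URLs from the question. The helper name `get_request_token` is hypothetical, and returning None for a missing token is an assumption about the desired behavior.

```python
from urllib.parse import urlparse, parse_qs


def get_request_token(url):
    """Return the request_token query value, or None if it is absent."""
    params = parse_qs(urlparse(url).query)
    # parse_qs maps each parameter name to a list of values; take the first.
    return params.get("request_token", [None])[0]


# The three URLs from the question, with the token in different positions.
urls = [
    "https://127.0.0.1/?action=login&type=login&status=success&request_token=oCS44HJQT2ZSCGb39H76CjgXb0s2klwA",
    "https://127.0.0.1/?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&action=login&type=login&status=success",
    "https://127.0.0.1/?&action=login&request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&type=login&status=success",
]

for url in urls:
    print(get_request_token(url))
```

With Selenium this would be called as `get_request_token(driver.current_url)`. If a single line is required, `parse_qs(urlparse(driver.current_url).query)['request_token'][0]` does the same thing, but raises KeyError when the parameter is missing.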

CodePudding user response:

Assuming you have the URLs as strings then you could use a regular expression to isolate the request tokens.

import re
urls = ['https://127.0.0.1/?action=login&type=login&status=success&request_token=oCS44HJQT2ZSCGb39H76CjgXb0s2klwA',
        'https://127.0.0.1/?request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&action=login&type=login&status=success',
        'https://127.0.0.1/?&action=login&request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&type=login&status=success']
for url in urls:
    # Require '?' or '&' before the name so that e.g. 'xrequest_token=' does not match;
    # the value runs up to the next '&' or the end of the string.
    m = re.search(r'[?&]request_token=([^&]*)', url)
    if m:
        print(m.group(1))
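If the regex route is preferred, wrapping it in a small helper makes the missing-token case explicit. This is only a sketch: the names `TOKEN_RE` and `token_from_url` are hypothetical, and unlike urllib.parse it does not decode percent-encoded values.

```python
import re

# '?' or '&' must precede the name, so 'xrequest_token=' is not matched.
# The value stops at the next '&' or at a '#' fragment marker.
TOKEN_RE = re.compile(r"[?&]request_token=([^&#]*)")


def token_from_url(url):
    """Return the raw request_token value from url, or None if not present."""
    m = TOKEN_RE.search(url)
    return m.group(1) if m else None


print(token_from_url(
    "https://127.0.0.1/?&action=login&request_token=43CbEWSxdqztXNRpb2zmypCr081eF92d&type=login&status=success"
))
```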