I have the following string, taken from HTML text:
=09=09=09=09=09=09=09=09=09href=3D"https://web.company.net/#/set-passwor=
d?token=3DeyJ0eFtehgeRs1QiLLKDHDSiOiJIUzI1NiJ9.eyJ1dWlkIjoiZTM4ZTMwMGItNWM5=
Zi00MTg1LThhY2EtNTVkYWEwYzE1ZjIzIiwidGVybXNfYWNjZXBBDHSdndsuOIiwiaWF0IjoxNjQ=
0MzI0NDk3LCJleHAiOjE2NDQ0OTcyOTcsIm5iZiI6MTY0NDMyNDU1N30.lRetwJcg9Gf6upQYx6_=
RonwGWjAcxvkE3szhj5Akxbrk&region=3D"=09=09=09=09=09=09=09=09=09style=3D"height: 40px; v-text-anchor: middle; wi=
dth: 300px"
The following string (a dynamic token) needs to be extracted:
3DeyJ0eFtehgeRs1QiLLKDHDSiOiJIUzI1NiJ9.eyJ1dWlkIjoiZTM4ZTMwMGItNWM5=
Zi00MTg1LThhY2EtNTVkYWEwYzE1ZjIzIiwidGVybXNfYWNjZXBBDHSdndsuOIiwiaWF0IjoxNjQ=
0MzI0NDk3LCJleHAiOjE2NDQ0OTcyOTcsIm5iZiI6MTY0NDMyNDU1N30.lRetwJcg9Gf6upQYx6_=
RonwGWjAcxvkE3szhj5Akxbrk
How can this be done using a regex?
Looking forward to your suggestions.
Thank you.
CodePudding user response:
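You can use the regexp 'token=([^&]+)'. The negated character class [^&] also matches newlines, so the capture runs across the wrapped lines and stops at the first &. The re.MULTILINE flag is harmless here but not required, because it only changes how ^ and $ behave.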
Try this:
import re

# Quoted-printable HTML fragment from the question.
html = '''
=09=09=09=09=09=09=09=09=09href=3D"https://web.company.net/#/set-passwor=
d?token=3DeyJ0eFtehgeRs1QiLLKDHDSiOiJIUzI1NiJ9.eyJ1dWlkIjoiZTM4ZTMwMGItNWM5=
Zi00MTg1LThhY2EtNTVkYWEwYzE1ZjIzIiwidGVybXNfYWNjZXBBDHSdndsuOIiwiaWF0IjoxNjQ=
0MzI0NDk3LCJleHAiOjE2NDQ0OTcyOTcsIm5iZiI6MTY0NDMyNDU1N30.lRetwJcg9Gf6upQYx6_=
RonwGWjAcxvkE3szhj5Akxbrk&region=3D"=09=09=09=09=09=09=09=09=09style=3D"height: 40px; v-text-anchor: middle; wi=
dth: 300px"
'''

# [^&]+ matches newlines too, so the capture spans the wrapped lines up to the first "&".
# re.MULTILINE only affects ^ and $, so it has no effect on this pattern.
m = re.search(r'token=([^&]+)', html, re.MULTILINE)
if m:
    print(m.group(1))
Output:
3DeyJ0eFtehgeRs1QiLLKDHDSiOiJIUzI1NiJ9.eyJ1dWlkIjoiZTM4ZTMwMGItNWM5=
Zi00MTg1LThhY2EtNTVkYWEwYzE1ZjIzIiwidGVybXNfYWNjZXBBDHSdndsuOIiwiaWF0IjoxNjQ=
0MzI0NDk3LCJleHAiOjE2NDQ0OTcyOTcsIm5iZiI6MTY0NDMyNDU1N30.lRetwJcg9Gf6upQYx6_=
RonwGWjAcxvkE3szhj5Akxbrk
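Note that the output above is still quoted-printable encoded: the leading 3D is the tail of =3D (an encoded =), and each = at the end of a line is a soft line break. If you need the decoded token instead, one option is to decode the text with the standard-library quopri module first and then run the regex. A minimal sketch, assuming a shortened, made-up sample in place of the real message and a slightly stricter pattern of my own choosing (it also stops at quotes and whitespace):

```python
import quopri
import re

# Shortened, made-up sample in the same quoted-printable style as the question.
raw = (
    '=09=09href=3D"https://web.company.net/#/set-passwor=\n'
    'd?token=3DeyJhbGciOiJub25lIn0.eyJzdWIiOiJkZW1vIn0.c2lnbmF0=\n'
    'dXJl&region=3D"=09=09style=3D"height: 40px"\n'
)

# quopri turns =3D into "=", =09 into a tab, and removes the "=" soft line breaks.
decoded = quopri.decodestring(raw.encode("ascii")).decode("ascii")

# After decoding, the token is a single unbroken run of characters.
m = re.search(r'token=([^&"\s]+)', decoded)
token = m.group(1) if m else None
print(token)
```

Decoding first means the regex no longer has to cope with wrapped lines, and the result can be used directly as the token.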
CodePudding user response:
A variation of @CodeMonkey's answer that does not need re.MULTILINE:
import re

# Quoted-printable HTML fragment from the question.
html = '''
=09=09=09=09=09=09=09=09=09href=3D"https://web.company.net/#/set-passwor=
d?token=3DeyJ0eFtehgeRs1QiLLKDHDSiOiJIUzI1NiJ9.eyJ1dWlkIjoiZTM4ZTMwMGItNWM5=
Zi00MTg1LThhY2EtNTVkYWEwYzE1ZjIzIiwidGVybXNfYWNjZXBBDHSdndsuOIiwiaWF0IjoxNjQ=
0MzI0NDk3LCJleHAiOjE2NDQ0OTcyOTcsIm5iZiI6MTY0NDMyNDU1N30.lRetwJcg9Gf6upQYx6_=
RonwGWjAcxvkE3szhj5Akxbrk&region=3D"=09=09=09=09=09=09=09=09=09style=3D"height: 40px; v-text-anchor: middle; wi=
dth: 300px"
'''

# [\s\S]+? lazily matches any character, newlines included;
# the lookahead (?=&) stops the match just before the first "&".
m = re.search(r'token=([\s\S]+?(?=&))', html)
if m:
    print(m.group(1))
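To see why the flag makes no difference, here is a small comparison on a made-up two-line sample (not the real message). As far as the re module is concerned, re.MULTILINE only changes ^ and $, whereas re.DOTALL is what decides whether . crosses a newline. The variable names are just for illustration:

```python
import re

# Made-up sample: a value that wraps across a line break, followed by "&".
text = 'x?token=3Dabc=\ndef&region=3D"'

greedy_class = r'token=([^&]+)'       # negated class already matches "\n"
lazy_any = r'token=([\s\S]+?)(?=&)'   # lazy "any character" up to the next "&"
lazy_dot = r'token=(.+?)(?=&)'        # "." does not match "\n" unless re.DOTALL is set

a = re.search(greedy_class, text).group(1)
b = re.search(greedy_class, text, re.MULTILINE).group(1)
c = re.search(lazy_any, text).group(1)
d = re.search(lazy_dot, text)         # None: "." cannot get past the newline
e = re.search(lazy_dot, text, re.DOTALL).group(1)

print(a == b == c == e, d)
```

So either pattern from this thread works with no flags at all; a flag only becomes necessary if you switch to a plain . for "any character".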