Get URL string from HTML using regex

Time:03-02

I am trying to extract the verification URL from HTML using regex, but it is not finding any matches.

I have tried different variations but no luck. If someone could guide me in the right direction, it would be awesome.

Here is the regex I built using https://bablosoft.github.io/RegexpConstructor:

<a\ href="(https://click\.discord\.com/ls/click\?upn=[\s\S]+)">[\s\S]+l\ \ \ \ \ \ \ \ \ \ </a>

HTML a tag:

<a href="https://click.discord.com/ls/click?upn=qDOo8cnwIoKzt0aLL1cBeFE1RlVCKJFF5zAq8ml-2BFh1dq-2FeX22E9yMPFmLMSO5CY89faRWhR5p2f4gO8aFsPDQ3vE2xTBhlJeGEE87p63cgiwNItAlEn-2FJFtj5yFA-2FautLWRGW0IKfxGDMHWrbeey8URRAkOrr-2BYBT-2FT5kcUfa1veZCytU-2B9wHLsndHWdUn8EtOOUBgB5uZfcE9bw5O3ZAt92B28NZ9SYzZ7jGK4jzI-3DRs1A_qHrw4GmpuuXmFxW5rh3mZJII65iIReHI98SsrMc1mgu8ShbSoZ8MmrOuOGGsGiDHIxAufqB6YPIqUicSQdng-2FHz3X4EcYlJSJWCiAZzUCcvYmGWiqeBrYsFUdcbj9tutXb-2BG-2FNwkX8P7WS7Q2pJobCyQQOkvVoRrpJG7V58-2FYRxv3qpWOuYkwFmbMMjphGS2i2dLkuOItJwH1cksLhGYWdjznem5YrL4mK3OF3tC-2BV1gMuDy55tbJ5iSEgEEBWZ1ez49a-2FDIIzEHUkcai9UP5w-3D-3D" style="text-decoration:none;line-height:100%;background:#5865f2;color:white;font-family:Ubuntu, Helvetica, Arial, sans-serif;font-size:15px;font-weight:normal;text-transform:none;margin:0px;" target="_blank">
            Verify Email
          </a>

Full HTML:

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office"><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><title></title>
  <!--[if !mso]><!-- -->
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <!--<![endif]-->

<style type="text/css">
  #outlook a { padding: 0; }
  .ReadMsgBody { width: 100%; }
  .ExternalClass { width: 100%; }
  .ExternalClass * { line-height:100%; }
  body { margin: 0; padding: 0; -webkit-text-size-adjust: 100%; -ms-text-size-adjust: 100%; }
  table, td { border-collapse:collapse; mso-table-lspace: 0pt; mso-table-rspace: 0pt; }
  img { border: 0; height: auto; line-height: 100%; outline: none; text-decoration: none; -ms-interpolation-mode: bicubic; }
  p { display: block; margin: 13px 0; }
</style>
<!--[if !mso]><!-->
<style type="text/css">
  @media only screen and (max-width:480px) {
    @-ms-viewport { width:320px; }
    @viewport { width:320px; }
  }
</style>
<!--<![endif]-->
<!--[if mso]>
<xml>
  <o:OfficeDocumentSettings>
    <o:AllowPNG/>
    <o:PixelsPerInch>96</o:PixelsPerInch>
  </o:OfficeDocumentSettings>
</xml>
<![endif]-->
<!--[if lte mso 11]>
<style type="text/css">
  .outlook-group-fix {
    width:100% !important;
  }
</style>
<![endif]-->

<!--[if !mso]><!-->
    <link href="https://fonts.googleapis.com/css?family=Ubuntu:300,400,500,700" rel="stylesheet" type="text/css">
    <style type="text/css">

        @import url(https://fonts.googleapis.com/css?family=Ubuntu:300,400,500,700);

    </style>
  <!--<![endif]--><style type="text/css">
  @media only screen and (min-width:480px) {
    .mj-column-per-100, * [aria-labelledby="mj-column-per-100"] { width:100%!important; }
  }
</style>
</head>
<body style="background: #F9F9F9;">
  <div style="background-color:#F9F9F9;"><!--[if mso | IE]>
      <table role="presentation" border="0" cellpadding="0" cellspacing="0" width="640" align="center" style="width:640px;">
        <tr>
          <td style="line-height:0px;font-size:0px;mso-line-height-rule:exactly;">
      <![endif]-->
  <style type="text/css">
    html, body, * {
      -webkit-text-size-adjust: none;
      text-size-adjust: none;
    }
    a {
      color:#1EB0F4;
      text-decoration:none;
    }
    a:hover {
      text-decoration:underline;
    }
  </style>
<div style="margin:0px auto;max-width:640px;background:transparent;"><table role="presentation" cellpadding="0" cellspacing="0" style="font-size:0px;width:100%;background:transparent;" align="center" border="0"><tbody><tr><td style="text-align:center;vertical-align:top;direction:ltr;font-size:0px;padding:40px 0px;"><!--[if mso | IE]>
      <table role="presentation" border="0" cellpadding="0" cellspacing="0"><tr><td style="vertical-align:top;width:640px;">
      <![endif]--><div aria-labelledby="mj-column-per-100"  style="vertical-align:top;display:inline-block;direction:ltr;font-size:13px;text-align:left;width:100%;"><table role="presentation" cellpadding="0" cellspacing="0" width="100%" border="0"><tbody><tr><td style="word-break:break-word;font-size:0px;padding:0px;" align="center"><table role="presentation" cellpadding="0" cellspacing="0" style="border-collapse:collapse;border-spacing:0px;" align="center" border="0"><tbody><tr><td style="width:138px;"><a href="https://click.discord.com/ls/click?upn=qDOo8cnwIoKzt0aLL1cBeARJoBrGSa2vu41A5vK-2B4ute1kWYI6zNuxQFCVciWW4K6TEd_qHrw4GmpuuXmFxW5rh3mZJII65iIReHI98SsrMc1mgu8ShbSoZ8MmrOuOGGsGiDHIxAufqB6YPIqUicSQdng-2FAkkdkJgcbJfSqW3G7PIwW0iDwBj6ngHMb3jpNUVVWN-2FoiD0N5rifXZR-2BuZ40dggmlyOiF01slKhKO45twRgh5yI-2B6RaM8bHbZ9G2qR9YaBAoAjs5won6SN5uA3pNhvrxK8VXeUVtOnahoRh0JoE1bxywCQ5yvaOCDMhZEA4yrSOdvd8YeVaoXfokZxRV5aw9Q-3D-3D" target="_blank"><img alt="" title="" height="38px" src="https://cdn.discordapp.com/email_assets/592423b8aedd155170617c9ae736e6e7.png" style="border:none;border-radius:;display:block;outline:none;text-decoration:none;width:100%;height:38px;" width="138"></a></td></tr></tbody></table></td></tr></tbody></table></div><!--[if mso | IE]>
      </td></tr></table>
      <![endif]--></td></tr></tbody></table></div><!--[if mso | IE]>
      </td></tr></table>
      <![endif]-->
      <!--[if mso | IE]>
      <table role="presentation" border="0" cellpadding="0" cellspacing="0" width="640" align="center" style="width:640px;">
        <tr>
          <td style="line-height:0px;font-size:0px;mso-line-height-rule:exactly;">
      <![endif]--><div style="max-width:640px;margin:0 auto;box-shadow:0px 1px 5px rgba(0,0,0,0.1);border-radius:4px;overflow:hidden"><div style="margin:0px auto;max-width:640px;background:#ffffff;"><table role="presentation" cellpadding="0" cellspacing="0" style="font-size:0px;width:100%;background:#ffffff;" align="center" border="0"><tbody><tr><td style="text-align:center;vertical-align:top;direction:ltr;font-size:0px;padding:40px 50px;"><!--[if mso | IE]>
      <table role="presentation" border="0" cellpadding="0" cellspacing="0"><tr><td style="vertical-align:top;width:640px;">
      <![endif]--><div aria-labelledby="mj-column-per-100"  style="vertical-align:top;display:inline-block;direction:ltr;font-size:13px;text-align:left;width:100%;"><table role="presentation" cellpadding="0" cellspacing="0" width="100%" border="0"><tbody><tr><td style="word-break:break-word;font-size:0px;padding:0px;" align="left"><div style="cursor:auto;color:#737F8D;font-family:Whitney, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:16px;line-height:24px;text-align:left;">
            
  <h2 style="font-family: Whitney, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-weight: 500;font-size: 20px;color: #4F545C;letter-spacing: 0.27px;">Hey AnthraX,</h2>
<p>Thanks for registering for an account on Discord! Before we get started, we just need to confirm that this is you. Click below to verify your email address:</p>

          </div></td></tr><tr><td style="word-break:break-word;font-size:0px;padding:10px 25px;padding-top:20px;" align="center"><table role="presentation" cellpadding="0" cellspacing="0" style="border-collapse:separate;" align="center" border="0"><tbody><tr><td style="border:none;border-radius:3px;color:white;cursor:auto;padding:15px 19px;" align="center" valign="middle" bgcolor="#5865f2"><a href="https://click.discord.com/ls/click?upn=qDOo8cnwIoKzt0aLL1cBeFE1RlVCKJFF5zAq8ml-2BFh1dq-2FeX22E9yMPFmLMSO5CY89faRWhR5p2f4gO8aFsPDQ3vE2xTBhlJeGEE87p63cgiwNItAlEn-2FJFtj5yFA-2FautLWRGW0IKfxGDMHWrbeey8URRAkOrr-2BYBT-2FT5kcUfa1veZCytU-2B9wHLsndHWdUn8EtOOUBgB5uZfcE9bw5O3ZAt92B28NZ9SYzZ7jGK4jzI-3DRs1A_qHrw4GmpuuXmFxW5rh3mZJII65iIReHI98SsrMc1mgu8ShbSoZ8MmrOuOGGsGiDHIxAufqB6YPIqUicSQdng-2FHz3X4EcYlJSJWCiAZzUCcvYmGWiqeBrYsFUdcbj9tutXb-2BG-2FNwkX8P7WS7Q2pJobCyQQOkvVoRrpJG7V58-2FYRxv3qpWOuYkwFmbMMjphGS2i2dLkuOItJwH1cksLhGYWdjznem5YrL4mK3OF3tC-2BV1gMuDy55tbJ5iSEgEEBWZ1ez49a-2FDIIzEHUkcai9UP5w-3D-3D" style="text-decoration:none;line-height:100%;background:#5865f2;color:white;font-family:Ubuntu, Helvetica, Arial, sans-serif;font-size:15px;font-weight:normal;text-transform:none;margin:0px;" target="_blank">
            Verify Email
          </a></td></tr></tbody></table></td></tr><tr><td style="word-break:break-word;font-size:0px;padding:30px 0px;"><p style="font-size:1px;margin:0px auto;border-top:1px solid #DCDDDE;width:100%;"></p><!--[if mso | IE]><table role="presentation" align="center" border="0" cellpadding="0" cellspacing="0" style="font-size:1px;margin:0px auto;border-top:1px solid #DCDDDE;width:100%;" width="640"><tr><td style="height:0;line-height:0;">&nbsp;</td></tr></table><![endif]--></td></tr><tr><td style="word-break:break-word;font-size:0px;padding:0px;" align="left"><div style="cursor:auto;color:#747F8D;font-family:Whitney, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:13px;line-height:16px;text-align:left;">
<p>Need help? <a href="https://click.discord.com/ls/click?upn=qDOo8cnwIoKzt0aLL1cBeNlOcN7VC1Mue2BSa5oqYEdKm-2BPBEvWHLEUfi61TfqfxuvBipSaAkPtkAVPOEnBIzw-3D-3DAwmi_qHrw4GmpuuXmFxW5rh3mZJII65iIReHI98SsrMc1mgu8ShbSoZ8MmrOuOGGsGiDHIxAufqB6YPIqUicSQdng-2FDwgAvuv1YbBdpifu41HAMYpKNCG-2B9DJkbHJh6dVJu9B-2Fhq6a9-2Bb-2F-2FfpPxu-2B-2BUFSXxP0C06ezs4FLQUQ-2BE0NSkC92V6ZreqEWrGFcnGoFFl9g-2FnwH216S3C73vNmQNkkSTGucOGa417dqe48fbTsGecU3qOkpa4YGVUr7jzsP0-2BfU7l1fARgfixc2v61fVrGOA-3D-3D" style="font-family: Whitney, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;color: #5865f2;">Contact our support team</a> or hit us up on Twitter <a href="https://click.discord.com/ls/click?upn=qDOo8cnwIoKzt0aLL1cBeHLasbud5D3vi74o1Q-2B2VLcLLCDOodJpEqop-2Fc-2F5Wr6ZmwhT_qHrw4GmpuuXmFxW5rh3mZJII65iIReHI98SsrMc1mgu8ShbSoZ8MmrOuOGGsGiDHIxAufqB6YPIqUicSQdng-2FCyF1AhUtrR4uuth-2FcXmIntavlMvRzaAnv5wOU2OgE73LslLWOQr-2BreJd6EODqDvhDW1Vfkwz9tDF4v7vXkggW4RizSDgglIdEZOfSS5bC9MIP1w2aHx4hNAZGzbL8fgALc-2BVlS5PQRUTo78YViVgjh0OyWMiwTywgQ92JJlzjFBxH5kaGO4DW5oRpchrhSP6Q-3D-3D" style="font-family: Whitney, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;color: #5865f2;">@discord</a>.<br>
Want to give us feedback? Let us know what you think on our <a href="https://click.discord.com/ls/click?upn=qDOo8cnwIoKzt0aLL1cBeGtifxhyb-2FEeTgeN31uAkBS2ZTvlNepPcLUnXgSC4-2BGKK50d_qHrw4GmpuuXmFxW5rh3mZJII65iIReHI98SsrMc1mgu8ShbSoZ8MmrOuOGGsGiDHIxAufqB6YPIqUicSQdng-2FOwEWSPbKMa2qo4v5D1DBjT-2BpfsmGt5hJSd6hBWtFxU1Qzs9MC77TBUFgP1VMuRMncLLS0v1dNVKuZKyBi4570KK7y92u7ySo8jc4IiehMbn9LlE9OLAvB6L-2FwBsnarvlcFcWZzoF-2FXUfqkzKAeidv8bjp2uyknoeeCJsueK4IYEBV0HLWkXZU0Ju7VWqIHdSA-3D-3D" style="font-family: Whitney, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;color: #5865f2;">feedback site</a>.</p>

</div></td></tr></tbody></table></div><!--[if mso | IE]>
      </td></tr></table>
      <![endif]--></td></tr></tbody></table></div><!--[if mso | IE]>
      </td></tr></table>
      <![endif]-->
      <!--[if mso | IE]>
      <table role="presentation" border="0" cellpadding="0" cellspacing="0" width="640" align="center" style="width:640px;">
        <tr>
          <td style="line-height:0px;font-size:0px;mso-line-height-rule:exactly;">
      <![endif]--></div><div style="margin:0px auto;max-width:640px;background:transparent;"><table role="presentation" cellpadding="0" cellspacing="0" style="font-size:0px;width:100%;background:transparent;" align="center" border="0"><tbody><tr><td style="text-align:center;vertical-align:top;direction:ltr;font-size:0px;padding:20px 0px;"><!--[if mso | IE]>
      <table role="presentation" border="0" cellpadding="0" cellspacing="0"><tr><td style="vertical-align:top;width:640px;">
      <![endif]--><div aria-labelledby="mj-column-per-100"  style="vertical-align:top;display:inline-block;direction:ltr;font-size:13px;text-align:left;width:100%;"><table role="presentation" cellpadding="0" cellspacing="0" width="100%" border="0"><tbody><tr><td style="word-break:break-word;font-size:0px;padding:0px;" align="center"><div style="cursor:auto;color:#99AAB5;font-family:Whitney, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:12px;line-height:24px;text-align:center;">
      Sent by Discord •
      <a href="https://click.discord.com/ls/click?upn=qDOo8cnwIoKzt0aLL1cBeHN2Bg5UBi6nJegCqE7rzswec30BdfDZLIuq6fJ2wlEZPvqD_qHrw4GmpuuXmFxW5rh3mZJII65iIReHI98SsrMc1mgu8ShbSoZ8MmrOuOGGsGiDHIxAufqB6YPIqUicSQdng-2FL8CYkuJZnwLKtco9SrpLcLgA4YXFbKgXYAzJ83G0yx55HBRM5IicBhzYhwNCgKYagMPnI5hSbCVrLmztcmoKV7BIs17FkQT1VXmNDPHGJKtwwJ5xGBwW1ojGi6aLanc2sj6Yd6GCjtCOIdWjar4ri9tBl2E7s5RkDn-2BqaSwkRSOLYM8KRHgmD2qV044iH0Esw-3D-3D" style="color:#1EB0F4;text-decoration:none;">check our blog</a>
      • <a href="https://click.discord.com/ls/click?upn=qDOo8cnwIoKzt0aLL1cBeHLasbud5D3vi74o1Q-2B2VLcLLCDOodJpEqop-2Fc-2F5Wr6Z7ZFk_qHrw4GmpuuXmFxW5rh3mZJII65iIReHI98SsrMc1mgu8ShbSoZ8MmrOuOGGsGiDHIxAufqB6YPIqUicSQdng-2FHj3YsAN2EBVLb5jbZZZUDt9HBzm-2Bp8oPeOp5-2Bk3vVtUp1nHcS58a6EY-2F-2BLnmMLQf1jCAmQ-2BYiGDf2nL9lh7tCPObTHD0FCkPEEA7wXwLEAa4Dsf6V4PS8k-2FsDEZMjz6zyk9zxHSteipzFL-2F8AsexwKfN3NvafYCCwCKJLu2V7cEbbiR56xGx3Bm7QlMM1-2FLjg-3D-3D" style="color:#1EB0F4;text-decoration:none;">@discord</a>
    </div></td></tr><tr><td style="word-break:break-word;font-size:0px;padding:0px;" align="center"><div style="cursor:auto;color:#99AAB5;font-family:Whitney, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:12px;line-height:24px;text-align:center;">
      444 De Haro Street, Suite 200, San Francisco, CA 94107
    </div></td></tr><tr><td style="word-break:break-word;font-size:0px;padding:0px;" align="left"><div style="cursor:auto;color:#000000;font-family:Whitney, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:13px;line-height:22px;text-align:left;">
      <img src="https://discord.com/api/science/948442403696681040/1458e1ec-fb58-4ac6-bdd1-da8d9d76387b.gif?properties=eyJlbWFpbF90eXBlIjogInVzZXJfdmVyaWZ5X2VtYWlsIn0=" width="1" height="1">
    </div></td></tr></tbody></table></div><!--[if mso | IE]>
      </td></tr></table>
      <![endif]--></td></tr></tbody></table></div><!--[if mso | IE]>
      </td></tr></table>
      <![endif]--></div>
<img src="https://click.discord.com/wf/open?upn=cFl-2B-2BbWYnK3zgfRShQl6Yk5t12LTKds7e07WNeUbpR3c1C9XFTGrMxjdlnrlwNoqz2BHpDIYmuxrcq1ieER-2FBSNVPnAAz9AVFigEw7ot3H7Z0E05DC7oojluNUnOdbMLy9f-2FNqSc-2BH7xVp2afvaRzdq47AooOBVdh3Ely4GwSZS1l9LNxz7PIwbuYUxsC3A-2BVsETVHrlZbqAebXnOYKJjzlQnsX9RGPpPmfcQs2bAdVXccLQxukf5K4sEfzJ4hT-2FaJ7o33yHgShdvVvp3Z99CA-3D-3D" alt="" width="1" height="1" border="0" style="height:1px !important;width:1px !important;border-width:0 !important;margin-top:0 !important;margin-bottom:0 !important;margin-right:0 !important;margin-left:0 !important;padding-top:0 !important;padding-bottom:0 !important;padding-right:0 !important;padding-left:0 !important;">

</body></html>

Thanks!

CodePudding user response:

/href="(.*upn=.*)" style.*?>\s+Verify/i
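
A minimal Python sketch of the pattern above, assuming the quantifier after `\s` is meant to be `+` so that it absorbs the newline and indentation between `>` and the link text. The sample HTML is a shortened stand-in for the email, and the `upn` values are placeholders rather than the real tokens. The regex in the question expects literal spaces directly after the final `l` of "Verify Email", while the actual markup has a newline there, which is one likely reason it finds no match.

```python
import re

# Shortened stand-in for the email; the upn values are placeholders.
SAMPLE_HTML = """<td><a href="https://click.discord.com/ls/click?upn=LOGO-TOKEN" target="_blank"><img alt=""></a></td>
<td><a href="https://click.discord.com/ls/click?upn=VERIFY-TOKEN" style="text-decoration:none;" target="_blank">
            Verify Email
          </a></td>
<p><a href="https://click.discord.com/ls/click?upn=SUPPORT-TOKEN" style="color: #5865f2;">Contact our support team</a></p>"""

# Group 1 captures the href value. Without re.DOTALL, "." stops at line
# breaks, so the greedy ".*" parts stay on the line that holds the tag.
# "\s+" then crosses the newline and indentation before "Verify".
PATTERN = re.compile(r'href="(.*upn=.*)" style.*?>\s+Verify', re.IGNORECASE)

match = PATTERN.search(SAMPLE_HTML)
verify_url = match.group(1) if match else None
print(verify_url)
```

The pattern depends on `style` being the attribute right after `href` and on the opening tag sitting on a single line. Adding `re.DOTALL` would let the greedy `.*` run across lines and capture text from an earlier link, so the flag is left off here.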

https://regex101.com/r/bsNXZg/2
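
If the attribute order or whitespace in the email ever changes, a regex tied to the exact markup can stop matching. As an alternative sketch (not part of the answer above), Python's standard-library `html.parser` can collect every link and pick the one whose visible text is "Verify Email". The class name `LinkTextCollector`, the helper `find_link_by_text`, and the sample HTML with placeholder `upn` values are illustrative assumptions.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

# Shortened stand-in for the email; the upn values are placeholders.
SAMPLE_HTML = """<td><a href="https://click.discord.com/ls/click?upn=LOGO-TOKEN" target="_blank"><img alt=""></a></td>
<td><a href="https://click.discord.com/ls/click?upn=VERIFY-TOKEN" style="text-decoration:none;" target="_blank">
            Verify Email
          </a></td>
<p><a href="https://click.discord.com/ls/click?upn=SUPPORT-TOKEN" style="color: #5865f2;">Contact our support team</a></p>"""


class LinkTextCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a> element."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...> element.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_link_by_text(html: str, wanted: str) -> Optional[str]:
    """Return the href of the first link whose text equals `wanted`."""
    parser = LinkTextCollector()
    parser.feed(html)
    parser.close()
    for href, text in parser.links:
        if text.lower() == wanted.lower():
            return href
    return None


verify_url = find_link_by_text(SAMPLE_HTML, "Verify Email")
print(verify_url)
```

Matching on the link text rather than on the surrounding attributes means the lookup does not depend on where `style` or `target` appear in the tag.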
