Get a verification code from an HTML string using regex

Time:10-28

I am writing an automation script that reads Gmail messages through the API, and it returns the HTML content below. I need only the code 191418 from this HTML, and I want to extract it with a regex. I tried this:

.*([0-9]{6}) 

to find the 6-digit code, but it returns 10 matches. I am not good at regex. Can someone please help me get the code using a regex?

<div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr"><br></div><u></u>
    <div>
        <center id="m_-2051398760120817894wrapper">
            <table id="m_-2051398760120817894main" width="100%">
                <tbody><tr id="m_-2051398760120817894logo">
                    <td>
                        <table width="100%">
                            <tbody><tr>
                                <td>
                                    <img src="test.com/logo.png" width="140px" alt="xxxxx Logo" style="padding:0 10px">
                                </td>
                            </tr>
                        </tbody></table>
                    </td>
                </tr>
                <tr>
                    <td height="18px"></td>
                </tr>
                <tr id="m_-2051398760120817894header">
                    <td>
                        <table width="100%">
                            <tbody><tr>
                                <td height="64px" style="background-color:#10069f;color:#fff;padding-left:24px;font-weight:700">Reset your password</td>
                            </tr>
                        </tbody></table>
                    </td>
                </tr>
                <tr id="m_-2051398760120817894content">
                    <td>
                        <table width="100%">
                            <tbody><tr>
                                <td style="background-color:#f6f5ff;padding:24px 24px 16px 24px">
                                    <p style="margin-top:0">The following is the verification code required to complete your password reset.</p>
                                    <p style="margin-bottom:24px">Enter the following verification code on the screen during the registration, and proceed to the next step.</p>
                                    <div style="display:block;text-align:center;margin-bottom:8px;background-color:#fff;height:92px;font-weight:600;font-size:36px;line-height:92px">191418</div>
                                    <span style="display:block;font-size:12px;color:#5d5d5d">*The verification code is valid only for 24 hours.</span>
                                </td>
                            </tr>
                        </tbody></table>
                    </td>
                </tr>
                <tr>
                    <td height="24px"></td>
                </tr>
                <tr id="m_-2051398760120817894footer">
                    <td>
                        <table width="100%">
                            <tbody><tr>
                                <td style="background-color:#6d7777;padding:16px 24px;font-size:12px;color:#fff">
                                    <table width="100%">
                                        <tbody><tr>
                                            <td id="m_-2051398760120817894footer-left">
                                                <span style="display:block">amnimo Inc.</span>
                                                <span style="display:block">0-3-30 usaa-fso, xxxxxxxx-shi, Tokyo, 180-8750, Japan</span>
                                                <span style="display:block">Phone:  81-422-52-6779</span>
                                                <span id="m_-2051398760120817894copyright-mb" style="margin-top:16px">© 2020 test Inc.</span>
                                            </td>
                                            <td id="m_-2051398760120817894footer-right">
                                                <span style="display:block">© 2020 test Inc.</span>
                                            </td>
                                        </tr>
                                    </tbody></table>
                                </td>
                            </tr>
                        </tbody></table>
                    </td>
                </tr>
            </tbody></table>
        </center>
    </div>
    </div></div>

CodePudding user response:

Maybe try word boundaries, which will prevent matching inside longer numbers:

\b([0-9]{6})\b

https://regex101.com/r/dQAiHU/1/
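
As a rough illustration of the word-boundary idea, here is a minimal Python sketch. It assumes the script is written in Python (the question does not say), and the `html` string is a shortened, hypothetical stand-in for the real email body.

```python
import re

# Hypothetical, shortened stand-in for the email body from the question.
html = (
    '<center id="m_-2051398760120817894wrapper">'
    '<td style="background-color:#10069f">Reset your password</td>'
    '<div style="font-size:36px;line-height:92px">191418</div>'
    '<span>Tokyo, 180-8750, Japan</span>'
    '</center>'
)

# \b on both sides requires the six digits to stand alone, so six-digit
# runs inside the long numeric id (or inside "10069f") are not matched.
codes = re.findall(r"\b([0-9]{6})\b", html)
print(codes)
```

Note that any other standalone six-digit number in the email (a six-digit postal code, for example) would match as well, so this pattern relies on the code being the only such number.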

CodePudding user response:

You should use a DOM library that lets you query the element you want and read its content. Parsing HTML with regex is a bad idea.
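
A minimal sketch of the parser-based approach, assuming Python and using only the standard library's `html.parser` (a third-party DOM library would offer a friendlier query interface). The `CodeFinder` class and the sample `html` string are hypothetical names made up for this illustration.

```python
import re
from html.parser import HTMLParser


class CodeFinder(HTMLParser):
    """Collects text that directly follows a <div> start tag and is exactly six digits."""

    def __init__(self):
        super().__init__()
        self.prev_tag = None  # name of the most recent start tag, reset on any end tag
        self.codes = []

    def handle_starttag(self, tag, attrs):
        self.prev_tag = tag

    def handle_endtag(self, tag):
        self.prev_tag = None

    def handle_data(self, data):
        text = data.strip()
        if self.prev_tag == "div" and re.fullmatch(r"[0-9]{6}", text):
            self.codes.append(text)


# Hypothetical, shortened stand-in for the email body from the question.
html = (
    '<div dir="ltr"><br>'
    '<p style="margin-top:0">Enter the following verification code.</p>'
    '<div style="font-size:36px;line-height:92px">191418</div>'
    '<span style="display:block">Tokyo, 180-8750, Japan</span>'
    '</div>'
)

finder = CodeFinder()
finder.feed(html)
finder.close()
print(finder.codes)
```

Because the parser tokenizes tags and attributes itself, numbers inside `id` or `style` attributes are never seen as text.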

If you must do it, matching six digits is not enough. Inspecting the HTML, I see that the code is the entire content of a div, so I would write something along these lines:

<div[^>]*>\d{6}<\/div>

Pattern explanation:

<div - match <div literally

[^>]* - match zero or more characters other than >

> - match > literally

\d{6} - match 6 digits

<\/div> - match </div> literally (the backslash escapes the /)


EDIT

To extract the desired text, use a capturing group:

<div[^>]*>(\d{6})<\/div>

The text in the first capturing group will then be your desired result.
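
For example, reading the first capturing group could look like this in Python (an assumption, since the question does not name a language). The `html` string is a shortened, hypothetical stand-in for the email body, and in Python the `/` needs no escaping.

```python
import re

# Hypothetical, shortened stand-in for the email body from the question.
html = (
    '<p style="margin-bottom:24px">Enter the following verification code.</p>'
    '<div style="text-align:center;font-size:36px;line-height:92px">191418</div>'
    '<span style="display:block">*The verification code is valid only for 24 hours.</span>'
)

# Match a <div ...> whose entire content is six digits and capture the digits.
match = re.search(r"<div[^>]*>(\d{6})</div>", html)

# group(1) is the first capturing group; guard against a missing match.
code = match.group(1) if match else None
print(code)
```

If the real markup puts whitespace or line breaks around the digits, the pattern would need `\s*` on either side of the group.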
