How to substring an HTML string based on div id

Time:12-25

Suppose I have a string in HTML format like this:

<p style="text-align:center">&nbsp;</p>
<p style="text-align:center">&nbsp;</p>
<p style="text-align:center"><strong>To The &lrm;<span>Embassy of The United Kingdom</span>&rlm;</strong></p>
<p>The ORG- certifies that &lrm;<strong><span>Mrs.</span></strong>&rlm;&lrm;<strong>&nbsp;</strong>&rlm;&lrm;<strong><span>Matilda Johan</span></strong>&rlm;,</p>
<p>has been&lrm;&rlm;&lrm;&rlm; working since&nbsp;<strong><span>01/10/2003</span></strong>&rlm; until present.</p>
<p>&lrm;<span>Presently, she is working as</span>&rlm;&lrm;&rlm;&lrm;&nbsp; a / an &lrm;<strong><span>JOB TITLE NOT DEFINED</span></strong>&rlm; at&nbsp;<strong><span>Dean of the Faculty of Engineering and Technology Office - College of Engineering and Technology - S</span></strong>&rlm;-&lrm;​​​​​​​&rlm;&lrm;​​​&lrm;<strong><span>College of Engineering and Technology </span></strong>&rlm;.</p>
<p><strong>This certificate was issued upon&nbsp;</strong>&lrm;<strong><span>her request</span></strong>&rlm;​​​​​​​&nbsp;<strong>and without any commitment on behalf of the ORG.</strong></p>
<div>
  <div id="dv_sign_en" style="float:left;clear:both;font-style: italic;">...</div>
  <div></div>
</div>

How can I get only the part of the string that comes before the parent div of the div whose id starts with dv_sign_, so that the result is:

<p style="text-align:center">&nbsp;</p>
<p style="text-align:center">&nbsp;</p>
<p style="text-align:center"><strong>To The &lrm;<span>Embassy of The United Kingdom</span>&rlm;</strong></p>
<p>The ORG- certifies that &lrm;<strong><span>Mrs.</span></strong>&rlm;&lrm;<strong>&nbsp;</strong>&rlm;&lrm;<strong><span>Matilda Johan</span></strong>&rlm;,</p>
<p>has been&lrm;&rlm;&lrm;&rlm; working since&nbsp;<strong><span>01/10/2003</span></strong>&rlm; until present.</p>
<p>&lrm;<span>Presently, she is working as</span>&rlm;&lrm;&rlm;&lrm;&nbsp; a / an &lrm;<strong><span>JOB TITLE NOT DEFINED</span></strong>&rlm; at&nbsp;<strong><span>Dean of the Faculty of Engineering and Technology Office - College of Engineering and Technology - S</span></strong>&rlm;-&lrm;​​​​​​​&rlm;&lrm;​​​&lrm;<strong><span>College of Engineering and Technology </span></strong>&rlm;.</p>
<p><strong>This certificate was issued upon&nbsp;</strong>&lrm;<strong><span>her request</span></strong>&rlm;​​​​​​​&nbsp;<strong>and without any commitment on behalf of the ORG.</strong></p>

CodePudding user response:

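    using System;
    using System.IO;

    // Read the HTML text from disk.
    String html;
    using (StreamReader reader = new StreamReader(@"D:\OneDrive\Dokumentumok\Projects\html.txt")) {
        html = reader.ReadToEnd();
    }

    // Cut at the signature div; this assumes id is the first attribute of the tag.
    Int32 index = html.IndexOf("<div id=\"dv_sign_", StringComparison.Ordinal);
    if (index >= 0) {
        html = html.Substring(0, index);

        // Then cut at the last plain <div> before it, which is the parent in the sample.
        index = html.LastIndexOf("<div>", StringComparison.Ordinal);
        if (index >= 0) {
            html = html.Substring(0, index);
        }
    }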
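
The snippet above only matches when the tag is written exactly as `<div id="dv_sign_...` and the parent is a bare `<div>`. A slightly more tolerant variant could look like the sketch below. `HtmlTrimmer` and `BeforeParentOf` are hypothetical names chosen for illustration, not part of the original answer. It assumes the nearest `<div` opened before the matched tag is its parent, which holds for the sample HTML but not for arbitrary markup; a real HTML parser would be safer there.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlTrimmer
{
    // Hypothetical helper: returns the part of html that precedes the parent
    // <div> of the first div whose id starts with idPrefix. Returns html
    // unchanged when no such div is found.
    public static string BeforeParentOf(string html, string idPrefix)
    {
        // Match an opening <div ...> tag whose id attribute starts with idPrefix,
        // wherever id sits among the attributes and with either quote style.
        string pattern = @"<div\b[^>]*\sid\s*=\s*[""']" + Regex.Escape(idPrefix);
        Match child = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        if (!child.Success)
        {
            return html;
        }

        // Everything before the matched child tag.
        string beforeChild = html.Substring(0, child.Index);

        // Assumption: the nearest "<div" before the child is its parent.
        int parent = beforeChild.LastIndexOf("<div", StringComparison.OrdinalIgnoreCase);

        // If there is no enclosing div, fall back to cutting at the child itself.
        return parent >= 0 ? beforeChild.Substring(0, parent) : beforeChild;
    }

    public static void Main()
    {
        string html = "<p>Body</p>\n<div>\n  <div id=\"dv_sign_en\" style=\"float:left\">...</div>\n  <div></div>\n</div>";
        Console.WriteLine(BeforeParentOf(html, "dv_sign_"));
    }
}
```

Returning the input unchanged when nothing matches avoids the `ArgumentOutOfRangeException` that `Substring(0, -1)` would throw.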