Getting text between html tags

Time:08-26

So basically I have this HTML, and what I want is the text in the cell next to the label name:, in this case cb6a296b-c7ba-4228-b9f2-d6e39947814e. I've tried using Jsoup, but for some reason I always get the full HTML back instead of just that element. Is there any way to get the value cb6a296b-c7ba-4228-b9f2-d6e39947814e?

html:

<td>
 <div>
  <h3>Id:</h3>
  <table style="border: none">
   <tbody>
    <tr>
     <td style="border: none"><b>id:</b></td>
     <td style="border: none"><span style="margin-left: 15px">testuuid1</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>idtype:</b></td>
     <td style="border: none"><span style="margin-left: 15px">uuid</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>territory:</b></td>
     <td style="border: none"><span style="margin-left: 15px">GB</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>type:</b></td>
     <td style="border: none"><span style="margin-left: 15px">cover</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>version:</b></td>
     <td style="border: none"><span style="margin-left: 15px">aa3601f8-219a-43e6-be36-0aa49d2f0943</span></td>
    </tr>
   </tbody>
  </table>
 </div>
 <div>
  <h3>File:</h3>
  <table style="border: none">
   <tbody>
    <tr>
     <td style="border: none"><b>extension:</b></td>
     <td style="border: none"><span style="margin-left: 15px">jpg</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>md5Checksum:</b></td>
     <td style="border: none"><span style="margin-left: 15px">f5e1725f067a697805f4af28bef55720</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>mimeType:</b></td>
     <td style="border: none"><span style="margin-left: 15px">image/jpeg</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>name:</b></td>
     <td style="border: none"><span style="margin-left: 15px">cb6a296b-c7ba-4228-b9f2-d6e39947814e</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>path:</b></td>
     <td style="border: none"><span style="margin-left: 15px"></span></td>
    </tr>
   </tbody>
  </table>
 </div>
 <div>
  <h3>FileInfo:</h3>
  <table style="border: none">
   <tbody>
    <tr>
     <td style="border: none"><b>created:</b></td>
     <td style="border: none"><span style="margin-left: 15px">2022-08-09T17:05:12Z</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>createdBy:</b></td>
     <td style="border: none"><span style="margin-left: 15px">admin</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>expires:</b></td>
     <td style="border: none"><span style="margin-left: 15px">2032-06-26T23:30:00Z</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>updated:</b></td>
     <td style="border: none"><span style="margin-left: 15px">2022-08-09T17:05:14Z</span></td>
    </tr>
    <tr>
     <td style="border: none"><b>updatedBy:</b></td>
     <td style="border: none"><span style="margin-left: 15px">admin</span></td>
    </tr>
   </tbody>
  </table>
 </div></td>

Program:

 val document: Document = Jsoup.parse(requestBody[0])
 val element = document.select("td:contains(name:)").get(0)

CodePudding user response:

You can just give your <span> element an id:

...
<tr>
   <td style="border: none"><b>name:</b></td>
   <td style="border: none"><span id="file-name" style="margin-left: 15px">cb6a296b-c7ba-4228-b9f2-d6e39947814e</span></td>
</tr>
...

You can then read the value of the field in JavaScript:

<script>
    let fileName = document.getElementById('file-name').textContent
    ...
</script>
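
If the HTML cannot be changed, the value can probably also be read with Jsoup directly. The original selector `td:contains(name:)` matches every `<td>` whose text, including the text of its descendants, contains `name:`, so the outer `<td>` that wraps all three tables likely matches too and comes first in document order, which would explain getting the whole HTML back. A minimal sketch, assuming each label sits in a `<b>` directly inside a `<td>` and the value is in the next `<td>` of the same row; `valueForLabel` and `sampleHtml` are hypothetical names used only for illustration:

```kotlin
import org.jsoup.Jsoup

// Shortened copy of the markup from the question, wrapped in a <table>
// so that the parser keeps the <tr>/<td> structure.
val sampleHtml = """
    <table><tbody>
      <tr><td><b>mimeType:</b></td><td><span>image/jpeg</span></td></tr>
      <tr><td><b>name:</b></td><td><span>cb6a296b-c7ba-4228-b9f2-d6e39947814e</span></td></tr>
    </tbody></table>
""".trimIndent()

// Hypothetical helper: returns the text of the cell that follows the
// label cell, or null if no cell carries exactly that label.
fun valueForLabel(html: String, label: String): String? {
    val document = Jsoup.parse(html)
    return document.select("td > b")
        .firstOrNull { it.text().trim() == label } // exact match on the label text
        ?.parent()                                  // the <td> holding the label
        ?.nextElementSibling()                      // the <td> holding the value
        ?.text()                                    // text only, without the <span> markup
}

fun main() {
    println(valueForLabel(sampleHtml, "name:")) // prints cb6a296b-c7ba-4228-b9f2-d6e39947814e
}
```

Matching the label exactly, rather than with `:contains`, avoids picking up enclosing cells or labels that merely include the same characters.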