Extract specific data from same table class on java based webpage with Excel VBA

Time:11-16

There is a table on a website; its HTML is shown below.

<div id="InnerContent" style="height: 668px; position: relative; overflow: auto;">
<table class="Template" width="100%">
    <tbody><tr class="Template">
        <td class="View">
            <table class="View" width="100%">
                <tbody><tr class="View">
                    <td class="View">
<p>there are some wording.</p>
  <div class="block-indent">
    <table class="Main">
      <tbody><tr class="Main">
        <td class="Literal"><a href="/default.aspx?Guid=asda334&amp;MenuId=&amp;Action=edit&amp;Reference=saada">1234567</a></td>
        <td class="Literal"><b>Amended</b></td>
        <td class="Literal">(aadasda) Total : 2232<br></td>
      </tr>
      <tr class="Main">
        <td class="Literal"><a href="/default.aspx?Guid=sdfs2323&amp;MenuId=&amp;Action=Edit&amp;Reference=edasd">123123</a></td>
        <td class="Literal"><b>Amended</b></td>
        <td class="Literal">(adasda) Total : 123<br></td>
      </tr>
      <tr class="Main">
        <td class="Literal"><a href="/default.aspx?Guid=12321asada&amp;MenuId=&amp;Action=Edit&amp;Reference=assada">97897</a></td>
        <td class="Literal"><b>Amended</b></td>
        <td class="Literal">(bdfgbgf) Total : 999<br></td>
      </tr>
    </tbody></table>
  </div>
<table class="Main">
 <tbody><tr class="Main">
  <td class="Literal" nowrap="">abc:</td>
  <td class="Field" title=""><span class="String">030</span></td>
  <td class="Literal">&nbsp;&nbsp;&nbsp;</td>
  <td class="Literal" nowrap="">cde:</td>
  <td class="Field" title=""><span class="String">1234567890</span></td>
 </tr>

 <tr class="Main">
  <td class="Literal" nowrap="">Version:</td>
  <td class="Field" title="older Version': 02"><span class="Changed String">03</span></td>
  <td class="Literal"></td>  <td class="Literal" nowrap="">Last Amended:</td>
  <td class="Field" title="'Last Amended': 13 Sep 21"><span class="Changed Date">15 Sep 21</span></td>
  <td class="Literal">&nbsp;&nbsp;&nbsp;</td>
  <td class="Literal" nowrap="">Revised:</td>
  <td class="Field" title=""><span class="String">&nbsp;</span></td>
 </tr>

 <tr class="Main">

  <td class="Literal" nowrap="">Order:</td>
  <td class="Field" title=""><span class="String">A (Amended)</span></td>
  <td class="Literal"></td>
  <td class="Literal" nowrap="">Order2:</td>
  <td class="Field" title=""><span class="String">W (Order)</span></td>
 </tr>

Before the "block-indent" section was added, I used the code below to get the data from the website.

Sheets("Sheetname").Range("E5") = ie.document.getElementById("InnerContent").getElementsByClassName("Template")(0).getElementsByClassName("View")(0).getElementsByClassName("Main")(2).getElementsByClassName("Field")(0).innerText

The result was 02, the value for "Version", because it was the second Main table result. Since the "block-indent" section was added to the page, the number of Main tables is no longer constant: Version can end up in the 5th Main element if block-indent contains 3 of them, or in the 6th if block-indent contains 4.

I have tried to get the entire table, but I only ever get the data in the "block-indent" section. How can I get the value for "Version"?

CodePudding user response:

Iterate through the tables to find the one you want.

Option Explicit

Sub demo()

    Dim oDom As Object
    Set oDom = CreateObject("HtmlFile")
   
    ' read html from file for testing
    Dim fso As Object, ts As Object
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set ts = fso.opentextfile("table.html")
    oDom.body.innerHTML = ts.readall
    ts.Close
    
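    ' HTMLTable and HTMLTableRow require a reference to the
    ' Microsoft HTML Object Library (Tools > References)
    Dim tbl As HTMLTable, r As HTMLTableRow
    For Each tbl In oDom.getElementsByTagName("table")
        For Each r In tbl.Rows
            ' skip rows too short to hold a label/value pair
            If r.Cells.Length > 1 Then
                If Trim(r.Cells(0).innerText) = "Version:" Then
                    Debug.Print r.Cells(1).innerText
                End If
                ' the "abc:" row has five cells; Cells(4) is the value after "cde:"
                If Trim(r.Cells(0).innerText) = "abc:" And r.Cells.Length > 4 Then
                    Debug.Print r.Cells(4).innerText
                End If
            End If
        Next
    Next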
End Sub
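
To use the same idea against the live page instead of a test file, the label search could be wrapped in a small function. This is only a sketch: FieldByLabel is a hypothetical helper name, not part of the original answer, and it assumes the value always sits in the cell directly after its label, as in the HTML from the question. It uses late binding, so it should not need the Microsoft HTML Object Library reference.

```vba
' Hypothetical helper (an assumption, not from the original answer):
' returns the text of the cell that follows the first cell whose
' text equals label, or "" when no such label is found.
Function FieldByLabel(doc As Object, label As String) As String
    Dim tbl As Object, r As Object, i As Long
    For Each tbl In doc.getElementsByTagName("table")
        For Each r In tbl.Rows
            ' stop one cell early so Cells(i + 1) always exists
            For i = 0 To r.Cells.Length - 2
                If Trim(r.Cells(i).innerText) = label Then
                    FieldByLabel = r.Cells(i + 1).innerText
                    Exit Function
                End If
            Next i
        Next r
    Next tbl
End Function
```

With the ie object from the question, the original assignment would then look something like Sheets("Sheetname").Range("E5") = FieldByLabel(ie.document, "Version:"). Because the lookup goes by label text, it should not matter how many Main tables the "block-indent" section contains.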