Where are the links to the detail pages hidden in this HTML?

Time:06-25

I'm looking at the following county court record:

Search results page

At the top of that display image, you can see the URL of the search facility that produced this:

https://www.evaultla.com/Subscriptions/Search/ascension

but it's behind a $50 paywall. The data in the search results are not found in "View Source", but they are in the DevTools code inspector, as seen at the bottom of the image. So that allows me to get the data from the results table, except for one thing.

When I click anywhere in a row of the results table, that pulls up a new page with additional details. For example, if I click in the first row, I get a page with the title "Details for 1053380" at (also behind the paywall):

https://www.evaultla.com/Subscriptions/detail/ascension/10874511

Now here's the mysterious part. The number at the end of that URL, "10874511", does not appear to be anywhere in the HTML in the code inspector for the search results page. Here is the part of that code for the results table, with the code for rows 2 through 9 elided:

<div id="grid-1010-body" data-ref="body"  role="presentation" style="left: 0px; width: 661px; height: 385px; top: 110px;">
    <div  role="rowgroup" id="gridview-1012" tabindex="0" style="overflow: hidden auto; margin: 0px; width: 661px; height: 384px;" data-componentid="gridview-1012">
        <div  role="presentation" id="ext-element-5" style="transform: translate3d(660px, 368px, 0px); line-height: 1px;"></div>
        <div  tabindex="0"></div>
        <div  role="presentation" style="width: 661px; transform: translate3d(0px, 0px, 0px);">
            <table id="gridview-1012-record-19" role="presentation" data-boundview="gridview-1012" data-recordid="19" data-recordindex="0"  style=";width:0" cellspacing="0" cellpadding="0">
                <tbody>
                    <tr  role="row">
                        <td  style="width:80px;" role="gridcell" tabindex="-1" data-columnid="gridcolumn-1013">
                            <div unselectable="on"  style="text-align:left;">1053380</div></td>
                        <td  style="width:90px;" role="gridcell" tabindex="-1" data-columnid="datecolumn-1014">
                            <div unselectable="on"  style="text-align:left;">05/20/2022</div></td>
                        <td  style="width:70px;" role="gridcell" tabindex="-1" data-columnid="gridcolumn-1015">
                            <div unselectable="on"  style="text-align:left;">COB</div></td>
                        <td  style="width: 51px;" role="gridcell" tabindex="-1" data-columnid="gridcolumn-1016">
                            <div unselectable="on"  style="text-align:left;">REDEMPTION</div></td>
                        <td  data-partycount="1" style="width: 51px;" role="gridcell" tabindex="-1" data-columnid="gridcolumn-1017">
                            <div unselectable="on"  style="text-align:left;">VINCE DIEZ PROPERTIES INC</div></td>
                        <td  style="width: 51px;" role="gridcell" tabindex="-1" data-columnid="gridcolumn-1018">
                            <div unselectable="on"  style="text-align:left;">EDENBORNE DEVELOPMENT CO LLC</div></td>
                        <td  style="width: 103px;" role="gridcell" tabindex="-1" data-columnid="gridcolumn-1019">
                            <div unselectable="on"  style="text-align:left;">TWN/RNG SECT: 45 TWN: 10 RANGE: 3 LOT: TRACT EB-1-A,TRACT EB-1-C,TRACT EB-2,TRACT EB-4-B COMMENT: PARCEL NO. 4541600 </div></td>
                        <td  style="width:80px;" role="gridcell" tabindex="-1" data-columnid="gridcolumn-1020">
                            <div unselectable="on"  style="text-align:right;">$45,345</div></td>
                        <td  style="width:85px;" role="gridcell" tabindex="-1" data-columnid="gridcolumn-1021">
                            <div unselectable="on"  style="text-align:left;">&nbsp;</div></td>
                    </tr></tbody></table>
                    
            <table ... </table>
            <table ... </table>
            <table ... </table>
            <table ... </table>
            <table ... </table>
            <table ... </table>
            <table ... </table>
            <table ... </table>
                        
    </div></div>
    <div  role="progressbar" aria-hidden="true" aria-disabled="false" id="loadmask-1043" tabindex="0" data-componentid="loadmask-1043" style="display: none;" aria-valuetext="Loading...">
        <div id="loadmask-1043-msgWrapEl" data-ref="msgWrapEl"  role="presentation" style="left: 293px; top: 164px;">
            <div id="loadmask-1043-msgEl" data-ref="msgEl"  role="presentation">
                <div id="loadmask-1043-msgTextEl" data-ref="msgTextEl"  role="presentation">Loading...
</div></div></div></div></div>

That code contains the data in the results table, but I don't see anywhere that it provides a link to another page. How can I find the information for that link?

And here's another mystery. The code pasted above was obtained by right-clicking on the first full div tag visible in the display image above, then clicking on "Expand All", then "Copy > Outer HTML". But if you compare the code in the image with the code pasted above, you can see that they're not identical. The content appears to be the same, but some of the attributes appear in a different order. For example, in the first div tag, the attribute data-ref="body" appears in a different position in the display image than in the pasted code, and in the second div tag, the same is true of the attribute id="gridview-1012". I guess that's not significant, and just means that the "Copy > Outer HTML" facility doesn't necessarily preserve attribute order, which I haven't noticed before.

Now, at the end of that second div tag, in the display image there's a button, "Event", with nothing corresponding to it in the pasted code. When I click on the button and then click in the strip that comes up, I get a box containing part of the code for a JavaScript function:

Event function

function ion(this.doDirectEvent, this, [a, !1])
}
else {
  this.doDirectEvent(a, !1)
}
}, onDirectCaptureEvent: function(a) {
    if (Ext.ele

I searched for "ion(" in the HTML for the whole results page in the code inspector and there were no matches. So I guess this function is defined in an included CSS file. If I hover over the header bar of the box, I see that the header is the tail end of a URL. I don't see a way to copy that URL to the clipboard in order to show it to you here. I searched the results page code for the number at the end of the URL, "273196", and got no matches. I don't know if anything about this "Event" is relevant to my question of how to find the URLs that each row links to. There are no more "Event" buttons in the rest of the code of the results table.

So, how does this code know that clicking on a row of that table should bring up a details page? And where is the information hidden that says what the URL of that page will be?

CodePudding user response:

I think you are looking at dynamically loaded content.

You can check this in the Network tab of the Developer Tools. You are looking for requests of type `xhr`.

In Firefox you can show only the XHR requests by clicking the XHR filter button in the Network tab's toolbar.

Check the requests and responses for all the XHR requests. You will probably find your information returned there as JSON or as an HTML document.
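If clicking through every request by hand gets tedious, one option is to export the Network tab's capture as a HAR file (both Firefox and Chrome can do this) and search all the recorded response bodies for a value you already know, such as the "10874511" from the detail URL. The following is only a rough sketch, assuming the export includes response bodies; the function names `load_har` and `find_in_har` are my own, not part of any library.

```python
import base64
import json


def load_har(path: str) -> dict:
    """Read a HAR file (it is plain JSON) exported from the Network tab."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def find_in_har(har: dict, needle: str) -> list[str]:
    """Return the URLs of recorded requests whose response body contains needle."""
    hits = []
    for entry in har["log"]["entries"]:
        content = entry["response"].get("content", {})
        body = content.get("text") or ""
        # HAR marks base64-encoded bodies with an "encoding" field.
        if content.get("encoding") == "base64":
            body = base64.b64decode(body).decode("utf-8", errors="replace")
        if needle in body:
            hits.append(entry["request"]["url"])
    return hits
```

Calling `find_in_har(load_har("search.har"), "10874511")`, where `search.har` is a hypothetical file name for your export, should list the request or requests that delivered that number to the page.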

Please let us know how it goes!

I suggest you check out this video for web-scraping dynamically loaded content: https://www.youtube.com/watch?v=Pu3gmdWsLYc

The video is about a Python library called Scrapy. If you are going to do any web scraping, I suggest you check it out. It is a must!

That said, the video covers the basics without requiring you to know Python or Scrapy.
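Even without Scrapy, once you have found the XHR response that carries the row data, turning it into detail-page URLs takes only a few lines of plain Python. This is a minimal sketch: the URL pattern is the one quoted in the question, but the JSON field names (`data`, `id`) are purely my assumption, so substitute whatever the real response uses.

```python
import json

# URL pattern taken from the detail page quoted in the question.
DETAIL_URL = "https://www.evaultla.com/Subscriptions/detail/{county}/{record_id}"


def detail_urls(response_text: str, county: str = "ascension",
                rows_key: str = "data", id_key: str = "id") -> list[str]:
    """Build one detail-page URL per row of a JSON search response.

    rows_key and id_key are hypothetical field names; replace them with
    the names that appear in the real XHR response.
    """
    payload = json.loads(response_text)
    rows = payload[rows_key]
    return [DETAIL_URL.format(county=county, record_id=row[id_key])
            for row in rows]
```

For example, if the response looked like `{"data": [{"id": 10874511}]}`, this would produce the detail URL from the question for the first row.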
