How to loop through this query using Jsoup?

Time:12-04

I want to loop through the news table and get the title and rating of each row. I tried different options, but I can't understand why the select method returns all the rows at once. I need to get each news block in a loop.

This is how I select the table:

Elements elements = document.select("#hnmain > tbody > tr:nth-child(3) > td > table");
for (Element element: elements){
    
}

Sample data from the HTML:

<table border="0" cellpadding="0" cellspacing="0">
 <tbody>
  <tr class="athing" id="33582264">
   <td align="right" valign="top" class="title"><span class="rank">1.</span></td>
   <td valign="top" class="votelinks">
    <center>
     <a id="up_33582264" href="vote?id=33582264&amp;how=up&amp;goto=front?day=2022-11-13">
      <div class="votearrow" title="upvote"></div></a>
    </center></td>
   <td class="title"><span class="titleline"><a href="https://upbase.io/">Show HN: I built my own PM tool after trying Trello, Asana, ClickUp, etc.</a><span class="sitebit comhead"> (<a href="from?site=upbase.io"><span class="sitestr">upbase.io</span></a>)</span></span></td>
  </tr>
  <tr>
   <td colspan="2"></td>
   <td class="subtext"><span class="subline"> <span class="score" id="score_33582264">632 points</span> by <a href="user?id=tonypham" class="hnuser">tonypham</a> <span class="age" title="2022-11-13T12:00:06"><a href="item?id=33582264">20 days ago</a></span> <span id="unv_33582264"></span> | <a href="hide?id=33582264&amp;goto=front?day=2022-11-13">hide</a> | <a href="item?id=33582264">456&nbsp;comments</a> </span></td>
  </tr>
  <tr class="spacer" style="height:5px"></tr>
  <tr class="athing" id="33584941">
   <td align="right" valign="top" class="title"><span class="rank">2.</span></td>
   <td valign="top" class="votelinks">
    <center>
     <a id="up_33584941" href="vote?id=33584941&amp;how=up&amp;goto=front?day=2022-11-13">
      <div class="votearrow" title="upvote"></div></a>
    </center></td>
   <td class="title"><span class="titleline"><a href="https://fathy.fr/html2svg">Forking Chrome to turn HTML into SVG</a><span class="sitebit comhead"> (<a href="from?site=fathy.fr"><span class="sitestr">fathy.fr</span></a>)</span></span></td>
  </tr>

CodePudding user response:

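Here is an example of how you can loop through the news table using Jsoup. The selector in the question matches the news table itself, so the loop runs only once and every select inside it returns the matches from all rows together. To get one news item per iteration, select the rows instead:

// First, select every news row (class "athing") inside the news table
Elements rows = document.select("#hnmain > tbody > tr:nth-child(3) > td > table tr.athing");

// Then, loop through the rows
for (Element row : rows) {
    // The title link is in the news row itself
    String title = row.select(".titleline > a").text();
    // The score is in the row that follows it (the one with class "subtext")
    Element details = row.nextElementSibling();
    String rating = details == null ? "" : details.select(".subtext .score").text();
    // Print the title and rating of each news item
    System.out.println("Title: " + title);
    System.out.println("Rating: " + rating);
}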

This code should print the title and rating of each news item in the table.

For example, for the first news item in the sample data, it would print:

Title: Show HN: I built my own PM tool after trying Trello, Asana, ClickUp, etc.

Rating: 632 points
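
The rating comes back as text such as "632 points". If a number is needed, for example to sort the items by score, the leading digits can be parsed out. This is only a minimal sketch, assuming the text starts with the point count; the class and method names (ScoreParser, parsePoints) are hypothetical and not part of Jsoup.

```java
public class ScoreParser {
    // Hypothetical helper: extracts the leading integer from text like "632 points".
    // Returns -1 when the text is null, empty, or does not start with a digit.
    public static int parsePoints(String scoreText) {
        if (scoreText == null) {
            return -1;
        }
        String trimmed = scoreText.trim();
        int end = 0;
        // Advance past the run of digits at the start of the text
        while (end < trimmed.length() && Character.isDigit(trimmed.charAt(end))) {
            end++;
        }
        return end == 0 ? -1 : Integer.parseInt(trimmed.substring(0, end));
    }

    public static void main(String[] args) {
        System.out.println(parsePoints("632 points"));
        System.out.println(parsePoints(""));
    }
}
```

An item without a score element yields an empty string from text(), which this helper maps to -1 rather than throwing.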

CodePudding user response:

If I understand your question correctly, I think this code will work for you:

Document doc = Jsoup.parse("<table border=\"0\" id=\"hnmain\" cellpadding=\"0\" cellspacing=\"0\"> <tbody> <tr class=\"athing\" id=\"33582264\"> <td align=\"right\" valign=\"top\" class=\"title\"><span class=\"rank\">1.</span></td> <td valign=\"top\" class=\"votelinks\"> <center> <a id=\"up_33582264\" href=\"vote?id=33582264&amp;how=up&amp;goto=front?day=2022-11-13\"> <div class=\"votearrow\" title=\"upvote\"></div></a> </center></td> <td class=\"title\"><span class=\"titleline\"><a href=\"https://upbase.io/\">Show HN: I built my own PM tool after trying Trello, Asana, ClickUp, etc.</a><span class=\"sitebit comhead\"> (<a href=\"from?site=upbase.io\"><span class=\"sitestr\">upbase.io</span></a>)</span></span></td> </tr> <tr> <td colspan=\"2\"></td> <td class=\"subtext\"><span class=\"subline\"> <span class=\"score\" id=\"score_33582264\">632 points</span> by <a href=\"user?id=tonypham\" class=\"hnuser\">tonypham</a> <span class=\"age\" title=\"2022-11-13T12:00:06\"><a href=\"item?id=33582264\">20 days ago</a></span> <span id=\"unv_33582264\"></span> | <a href=\"hide?id=33582264&amp;goto=front?day=2022-11-13\">hide</a> | <a href=\"item?id=33582264\">456&nbsp;comments</a> </span></td> </tr> <tr class=\"spacer\" style=\"height:5px\"></tr> <tr class=\"athing\" id=\"33584941\"> <td align=\"right\" valign=\"top\" class=\"title\"><span class=\"rank\">2.</span></td> <td valign=\"top\" class=\"votelinks\"> <center> <a id=\"up_33584941\" href=\"vote?id=33584941&amp;how=up&amp;goto=front?day=2022-11-13\"> <div class=\"votearrow\" title=\"upvote\"></div></a> </center></td> <td class=\"title\"><span class=\"titleline\"><a href=\"https://fathy.fr/html2svg\">Forking Chrome to turn HTML into SVG</a><span class=\"sitebit comhead\"> (<a href=\"from?site=fathy.fr\"><span class=\"sitestr\">fathy.fr</span></a>)</span></span></td> </tr>");
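// Each news item is a row with class "athing"
Elements elements = doc.select("#hnmain .athing");
for (Element element : elements) {
    // ".titleline > a" matches only the headline link, not the rank cell or the site link
    String title = element.select(".titleline > a").text();
    String rank = element.select(".rank").text();

    System.out.println(title + " -- " + rank);
}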