I want to loop through the news table and get the title and rating of each row. I tried different options, but I can't understand why the select method returns all the rows at once. I need to get each news block separately in a loop.
I used this selector to get the table:
Elements elements = document.select("#hnmain > tbody > tr:nth-child(3) > td > table");
for (Element element: elements){
}
Sample data from the HTML:
<table border="0" cellpadding="0" cellspacing="0">
<tbody>
<tr id="33582264">
<td align="right" valign="top" ><span >1.</span></td>
<td valign="top" >
<center>
<a id="up_33582264" href="vote?id=33582264&how=up&goto=front?day=2022-11-13">
<div title="upvote"></div></a>
</center></td>
<td ><span ><a href="https://upbase.io/">Show HN: I built my own PM tool after trying Trello, Asana, ClickUp, etc.</a><span > (<a href="from?site=upbase.io"><span >upbase.io</span></a>)</span></span></td>
</tr>
<tr>
<td colspan="2"></td>
<td ><span > <span id="score_33582264">632 points</span> by <a href="user?id=tonypham" >tonypham</a> <span title="2022-11-13T12:00:06"><a href="item?id=33582264">20 days ago</a></span> <span id="unv_33582264"></span> | <a href="hide?id=33582264&goto=front?day=2022-11-13">hide</a> | <a href="item?id=33582264">456 comments</a> </span></td>
</tr>
<tr style="height:5px"></tr>
<tr id="33584941">
<td align="right" valign="top" ><span >2.</span></td>
<td valign="top" >
<center>
<a id="up_33584941" href="vote?id=33584941&how=up&goto=front?day=2022-11-13">
<div title="upvote"></div></a>
</center></td>
<td ><span ><a href="https://fathy.fr/html2svg">Forking Chrome to turn HTML into SVG</a><span > (<a href="from?site=fathy.fr"><span >fathy.fr</span></a>)</span></span></td>
</tr>
CodePudding user response:
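The selector in the question matches a single element, the table itself, so the loop body runs only once and any select call inside it sees every row at the same time. To get one news item per iteration, select the rows instead. Here is an example of how you can loop through the news table using Jsoup. It assumes the rows carry the class attributes athing, titleline, and score, which the pasted sample omits:
// First, select each news row (class "athing") inside the news table
Elements rows = document.select("#hnmain > tbody > tr:nth-child(3) > td > table tr.athing");
// Then, loop through the rows, one news item per iteration
for (Element row : rows) {
    // The title link is the direct child of span.titleline
    String title = row.select(".titleline > a").text();
    // The score sits in the row that follows the title row, which may be missing
    Element subtextRow = row.nextElementSibling();
    String rating = subtextRow == null ? "" : subtextRow.select(".score").text();
    // Print the title and rating of each news item
    System.out.println("Title: " + title);
    System.out.println("Rating: " + rating);
}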
This code should print the title and rating of each news item in the table.
For example, for the first news item in the sample data, it would print:
Title: Show HN: I built my own PM tool after trying Trello, Asana, ClickUp, etc.
Rating: 632 points
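If the rating is needed as a number rather than as text such as "632 points", one possible follow-up is to parse the digits out of the string. The sketch below is only an illustration: ScoreText and parsePoints are hypothetical names, not part of Jsoup, and it assumes the score text always starts with digits followed by "point" or "points".

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup: turns score text into a number.
final class ScoreText {
    // Assumption: the text starts with up to nine digits, then "point" or "points".
    private static final Pattern POINTS =
            Pattern.compile("^\\s*(\\d{1,9})\\s+points?\\b");

    // Returns the numeric score, or an empty OptionalInt when the text does not match.
    static OptionalInt parsePoints(String text) {
        if (text == null) {
            return OptionalInt.empty();
        }
        Matcher matcher = POINTS.matcher(text);
        if (!matcher.find()) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(Integer.parseInt(matcher.group(1)));
    }

    public static void main(String[] args) {
        System.out.println(parsePoints("632 points")); // OptionalInt[632]
        System.out.println(parsePoints(""));           // OptionalInt.empty
    }
}
```

Returning an OptionalInt instead of a plain int keeps a row without a score (for example, one with no following subtext row) distinguishable from a score of zero.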
CodePudding user response:
If I understand your question correctly, I think this code will work for you:
Document doc = Jsoup.parse("<table border=\"0\" id=\"hnmain\" cellpadding=\"0\" cellspacing=\"0\"> <tbody> <tr class=\"athing\" id=\"33582264\"> <td align=\"right\" valign=\"top\" class=\"title\"><span class=\"rank\">1.</span></td> <td valign=\"top\" class=\"votelinks\"> <center> <a id=\"up_33582264\" href=\"vote?id=33582264&how=up&goto=front?day=2022-11-13\"> <div class=\"votearrow\" title=\"upvote\"></div></a> </center></td> <td class=\"title\"><span class=\"titleline\"><a href=\"https://upbase.io/\">Show HN: I built my own PM tool after trying Trello, Asana, ClickUp, etc.</a><span class=\"sitebit comhead\"> (<a href=\"from?site=upbase.io\"><span class=\"sitestr\">upbase.io</span></a>)</span></span></td> </tr> <tr> <td colspan=\"2\"></td> <td class=\"subtext\"><span class=\"subline\"> <span class=\"score\" id=\"score_33582264\">632 points</span> by <a href=\"user?id=tonypham\" class=\"hnuser\">tonypham</a> <span class=\"age\" title=\"2022-11-13T12:00:06\"><a href=\"item?id=33582264\">20 days ago</a></span> <span id=\"unv_33582264\"></span> | <a href=\"hide?id=33582264&goto=front?day=2022-11-13\">hide</a> | <a href=\"item?id=33582264\">456 comments</a> </span></td> </tr> <tr class=\"spacer\" style=\"height:5px\"></tr> <tr class=\"athing\" id=\"33584941\"> <td align=\"right\" valign=\"top\" class=\"title\"><span class=\"rank\">2.</span></td> <td valign=\"top\" class=\"votelinks\"> <center> <a id=\"up_33584941\" href=\"vote?id=33584941&how=up&goto=front?day=2022-11-13\"> <div class=\"votearrow\" title=\"upvote\"></div></a> </center></td> <td class=\"title\"><span class=\"titleline\"><a href=\"https://fathy.fr/html2svg\">Forking Chrome to turn HTML into SVG</a><span class=\"sitebit comhead\"> (<a href=\"from?site=fathy.fr\"><span class=\"sitestr\">fathy.fr</span></a>)</span></span></td> </tr>");
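// Select one row per news item
Elements elements = doc.select("#hnmain .athing");
for (Element element : elements) {
    // ".title" alone would also match the rank cell and the site name, so target the title link
    String title = element.select(".titleline > a").text();
    String rank = element.select(".rank").text();
    System.out.println(title + " -- " + rank);
}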