I need to display the top-level row of this table, which has another table nested inside it:
<html><body>
<div id = div1><table><tbody>
<tr><td>Steve</td>
<td><table><tbody><tr><td>Steve2</td></tr></tbody></table>
</tr></tbody></table></body></html>
There can be more than one row.
I want to extract the content of the tr at the first level only (not the nested <tr><td>Steve2</td></tr>).
This is the code:
String html = "<html><body>"
        + "<div id = div1><table><tbody>"
        + "<tr><td>Steve</td>"
        + "<td><table><tbody><tr><td>Steve2</td></tr></tbody></table>"
        + "</tr></tbody></table></body></html>";
Document doc = Jsoup.parse(html);
// Select the outer table, a direct child of div#div1
Elements elemHtml = doc.select("div#div1>table");
for (Element elem1 : elemHtml) {
    // "tr" matches every descendant row, including the rows of the nested table
    Elements elem2 = elem1.select("tr");
    for (Element elem3 : elem2) {
        System.out.println("Content: " + elem3);
        System.out.println("----------");
    }
}
I tried adding a <div> tag inside the table, but then the parsing doesn't work.
CodePudding user response:
Change your CSS selector to div#div1 > table > tbody > tr
so that it matches only the <tr> elements that are direct children of your <tbody> element. That is what > (the child combinator) means in CSS.
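To make the difference concrete, here is a minimal sketch that counts the rows matched by a descendant selector and by the child-combinator selector. The class name SelectorDemo and the helper countRows are made up for illustration, and the HTML is a tidied copy of the markup from the question; it assumes Jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SelectorDemo {
    // Tidied copy of the question's markup: one outer row holding a nested table
    static final String HTML = "<div id='div1'><table><tbody>"
            + "<tr><td>Steve</td>"
            + "<td><table><tbody><tr><td>Steve2</td></tr></tbody></table></td>"
            + "</tr></tbody></table></div>";

    // Hypothetical helper: number of elements matched by the given CSS query
    static int countRows(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // Descendant selector: matches the outer row and the nested row
        System.out.println(countRows("div#div1 tr"));
        // Child combinators: matches only rows directly under the outer tbody
        System.out.println(countRows("div#div1 > table > tbody > tr"));
    }
}
```

The first query should report both rows, while the second should report only the outer one, because each > step requires a direct parent-child relationship.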
CodePudding user response:
I've made a more complex HTML example to show that the solution also works for a more general case than the one in the question:
<html> <body> <div id = div1> <table> <tbody>
<tr> <td>Steve1</td> <td> <table> <tbody> <tr>
<td>Steve2a</td> </tr> <tr> <td>Steve2b</td>
</tr> </tbody> </table> </tr> <tr> <td>Steve3</td>
<td> <table> <tbody> <tr> <td>Steve4</td> </tr>
</tbody> </table> </tr> </tbody> </table>
</body> </html>
Use the following selector to get all the rows of the outer table: div#div1 > table > tbody > tr
Then iterate over these rows and take the first cell of each one with select("td").first().
Full code -
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

String html2 = "<html> <body> <div id = div1> <table> <tbody>"
        + "<tr> <td>Steve1</td> <td> <table> <tbody> <tr>"
        + "<td>Steve2a</td> </tr> <tr> <td>Steve2b</td>"
        + "</tr> </tbody> </table> </tr> <tr> <td>Steve3</td>"
        + "<td> <table> <tbody> <tr> <td>Steve4</td> </tr>"
        + "</tbody> </table> </tr> </tbody> </table>"
        + "</body> </html>";
Document doc = Jsoup.parse(html2);
// Only the rows that are direct children of the outer table's tbody
Elements outerRows = doc.select("div#div1>table> tbody > tr");
for (Element row : outerRows) {
    // First cell of the row; the nested table sits in a later cell
    Element data = row.select("td").first();
    System.out.println(data);
    System.out.println("------------");
}
If you want only the text (SteveX), then you can get it with the text() method:
System.out.println(data.text());
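As a variant of the loop above, the sketch below collects the text of the first direct cell of every outer row into a list. It uses row.children() rather than row.select("td"), so cells of the nested tables are never considered. The class name OuterRows, the helper firstCellTexts and the SAMPLE constant are hypothetical names for illustration, and the HTML is a tidied copy of the example above; Jsoup is assumed to be on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class OuterRows {
    // Tidied copy of the example markup: two outer rows, each with a nested table
    static final String SAMPLE = "<div id='div1'><table><tbody>"
            + "<tr><td>Steve1</td><td><table><tbody>"
            + "<tr><td>Steve2a</td></tr><tr><td>Steve2b</td></tr>"
            + "</tbody></table></td></tr>"
            + "<tr><td>Steve3</td><td><table><tbody>"
            + "<tr><td>Steve4</td></tr>"
            + "</tbody></table></td></tr>"
            + "</tbody></table></div>";

    // Hypothetical helper: text of the first direct cell of each outer row
    static List<String> firstCellTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element row : doc.select("div#div1 > table > tbody > tr")) {
            // children() returns only direct child elements, i.e. the outer cells
            Elements cells = row.children();
            if (!cells.isEmpty()) {
                texts.add(cells.first().text());
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(firstCellTexts(SAMPLE));
    }
}
```

For the sample markup this should yield Steve1 and Steve3, skipping the Steve2a, Steve2b and Steve4 cells of the nested tables.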