How to pick the first ul tag after an h4 tag, skipping any other tags between them, with Nokogiri?

Time:05-25

I'm trying to get the first ul tag after each h4 tag, skipping any div tags in between:

<h4>
 <a>
  "Q1. some text"
 </a>
</h4>
<ul>
 <li>answer</li>
 <li>answer</li>
 <li>answer</li>
</ul>

<h4>
 <a>
  "Q2. Some text"
 </a>
</h4>
<ul>
 <li>answer</li>
 <li>answer</li>
 <li>answer</li>
</ul>

<h4>
 <a>
  "Q2. Some text"
 </a>
</h4>
<div>WITH OTHER INFO THAT i DON'T WANT</div>
<ul>
 <li>answer</li>
 <li>answer</li>
 <li>answer</li>
</ul>

<h4>
 <a>
  "Q2. Some text"
 </a>
</h4>
<div>WITH OTHER INFO THAT i DON'T WANT</div>
<ul>
 <li>answer</li>
</ul>
<ul>
 <li>DONT NEED THIS</li>
</ul>
<ul>
 <li>DONT NEED THIS</li>
</ul>

The markup repeats this pattern many times, so I need to pick just the first ul tag after each h4, skipping the div tags, using Nokogiri and Ruby.

require 'nokogiri'

doc = Nokogiri.HTML(DATA) 

CodePudding user response:

Use the following-sibling axis (or the following axis, if the ul is not a sibling of the h4) and take the first matching ul:

doc.xpath('//h4/following-sibling::ul[1]')