I'm trying to save only the links to the sample pages on this MusicRadar page:
require 'open-uri'
require 'nokogiri'
link = 'https://www.musicradar.com/news/tech/free-music-samples-royalty-free-loops-hits-and-multis-to-download'
html = OpenURI.open_uri(link)
doc = Nokogiri::HTML(html)
#used grep because every sample link in that page ends with '-samples'
doc.xpath('//div/a/@href').grep(/-samples/)
The problem is that it only finds 3 of those links. What am I doing wrong? And what if I wanted to open each of those links?
CodePudding user response:
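CSS selectors are often more convenient than XPath (if the document structure is good enough for that).
Your XPath //div/a is similar to the CSS selector div > a: it only matches links that are direct children of a div. You don't need that restriction, and it is why links get missed: some of the links are inside p elements, for example.
If you need all links whose href contains -samples, you can use the *= attribute selector: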
doc.css('a[href*="-samples"]') # returns a Nokogiri::XML::NodeSet with the matched elements
doc.css('a[href*="-samples"]').map { |a| a[:href] } # returns an array of URLs