I am trying to scrape a website with the following script:
require 'nokogiri'
require 'open-uri'
require 'pp'
require 'csv'
# Download the page once and cache it locally.
unless File.readable?('data.html')
  url = 'https://www.bananatic.com/de/forum/games/'
  data = URI.open(url).read
  File.open('data.html', 'wb') { |f| f << data }
end
data = File.read('data.html')
document = Nokogiri::HTML(data)
per = document.xpath('//div[@]/text()[string-length(normalize-space(.)) > 0]')
              .map { |node| node.to_s[/\d+/] }
p per
pir = document.xpath('//div[@]/text()[string-length(normalize-space(.)) > 0]')
              .map { |node| node.to_s[/\w+/] }
p pir
links2 = document.css('.topics ul li div')
res = links2.map do |lk|
  name = lk.css('.name p a').inner_text
end
p res
To fix it I tried adding a regular expression, but the attempt failed.
I simply replaced .inner_text with .to_s[/\w+/], but I don't get the result I expect.
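
For reference, here is a minimal sketch of what the [/\w+/] substitution does on plain strings, with no Nokogiri involved. The sample markup and the variable names (html, text and so on) are made up for illustration and are not taken from the real page. It assumes that calling .to_s on an element result gives back the markup rather than only the text, while .inner_text gives the text including any surrounding whitespace.

```ruby
# Hypothetical markup, standing in for what `.to_s` might return for a link node.
html = '<a href="/de/forum/games/1">Some Game Title</a>'
# Hypothetical text, standing in for what `.inner_text` might return.
text = "\n   Some Game Title \n"

# String#[] with a regex returns only the first match.
first_from_html = html[/\w+/]  # first run of word characters in the markup
first_from_text = text[/\w+/]  # first word of the text

# String#scan returns every match; String#strip trims surrounding whitespace.
all_words = text.scan(/\w+/)
stripped  = text.strip

p first_from_html
p first_from_text
p all_words
p stripped
```

Under these assumptions, the regex on the markup picks up the tag name, and on the text it picks up only the first word, so trimming with strip (or collecting with scan) may be closer to what is wanted than a single [/\w+/] lookup.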