Unable to click download button with web scraper

Time:06-26

For whatever reason, Mechanize is unable to click this Export button. I'm trying to automate downloading a fishing report, which is a public government CSV file. Does anyone have any ideas on how to get it to work?

agent = Mechanize.new
url = 'https://nrm.dfg.ca.gov/FishPlants/'

log :debug, "reading HTML from #{url}"
page = agent.get(url)

log :debug, 'loaded page'
form = page.search('#aspnetForm').first
button = page.search('.application_button').first

log :debug, 'clicking Export button'
response = agent.submit(form, button)

I get a stack trace with the following error:

...
    form.add_button_to_query(button) if button
        ^^^^^^^^^^^^^^^^^^^^
/Users/aronlilland/.rvm/gems/ruby-3.1.2/gems/mechanize-2.8.5/lib/mechanize.rb:581:in `submit'
/Users/aronlilland/Documents/dev/fishing-report/tasks/download/fishing_report.rake:27:in `block (2 levels) in <top (required)>'
Tasks: TOP => download:fishing_report
(See full trace by running task with --trace)

I've confirmed that the form is found successfully, but the button is an input field.

The page seems relatively straightforward, so I don't know why I'm unable to scrape it, and it's public data.

<form method="post" action="./" id="aspnetForm">
  <!-- .... -->
  <input
    type="submit"
    name="ctl00$cphContentMiddle$btnExport"
    value="Export"
    id="ctl00_cphContentMiddle_btnExport"
    
  >
  <!-- .... -->
</form>

CodePudding user response:

agent.submit got the wrong type for form

The issue here could be seen if you had included the beginning of the stack trace:

.../ruby/gems/3.1.0/gems/mechanize-2.8.5/lib/mechanize.rb:581:in `submit': undefined method `add_button_to_query' for #<Nokogiri::XML::Element:0x1ea14 name="form" attributes=
[#<Nokogiri::XML::Attr:0xbd38 name="method" value="post">, ...

Mechanize expects submit to be given a Mechanize::Form instance, but it got an instance of Nokogiri::XML::Element instead, as you can see by adding this to your code:

form.class # => Nokogiri::XML::Element

If you check the docs for the Mechanize::Form class, you can see the example they give to get you the form object is this:

form = page.forms.first # => Mechanize::Form

as opposed to what you used:

form = page.search('#aspnetForm').first

The call to search here is delegated to Nokogiri, and therefore doesn't return the object type you need, but rather a Nokogiri element.

button also has the wrong type

By the way, the same applies to this line:

button = page.search('.application_button').first

If you fix the type of form, you'll run into a similar issue with button not being the expected type. Again, there's an example in the docs showing how a button is found:

agent.submit(page.forms.first, page.forms.first.buttons.first)

You'll still need to work out how to select the specific button you need. I haven't worked with Mechanize before, so I can't offer a concrete suggestion here. Presumably there's a way to convert the button you find through search to a Mechanize::Form::Button instance.