Google Sheets - Scrape table involved with pagination

Time:11-27

I'm trying to find a workaround in Google Sheets. I'm pulling data from finviz.com to build custom stock screeners, but the site paginates its results, so only the first 20 rows are returned.

When I click the second page in the pagination section of the table, only the URL changes: it gains a parameter giving the first row of the new page. If the first results page holds 20 rows, the URL of the second page contains "r=21", the number of the first row on that page.

How can I make sure I'm pulling all the data when the table is paginated? Looking at the page source, these parameters are stored in hrefs. If the pagination had 3 pages of results, the new URLs would show up as hrefs inside the <table/> element, for example:

<table>
  <a href="screener.ashx?v=111&f=targetprice_a5&r=21"/>
  <a href="screener.ashx?v=111&f=targetprice_a5&r=41"/>
  <a href="screener.ashx?v=111&f=targetprice_a5&r=61"/>
</table>
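
The row offsets could be read straight out of hrefs like the ones above. Here is a minimal sketch in Python (outside Google Sheets, so it only illustrates the idea), assuming the page source has already been fetched into a string. The class name `PageLinkParser` and the `sample` string are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qs


class PageLinkParser(HTMLParser):
    """Collects the numeric "r" query parameter from every <a href=...>."""

    def __init__(self):
        super().__init__()
        self.offsets = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # parse_qs returns a list of values per key, e.g. {"r": ["21"]}
        r_values = parse_qs(urlsplit(href).query).get("r")
        if r_values and r_values[0].isdigit():
            self.offsets.append(int(r_values[0]))


# Hypothetical input: the snippet from the question, as a string.
sample = """<table>
  <a href="screener.ashx?v=111&f=targetprice_a5&r=21"/>
  <a href="screener.ashx?v=111&f=targetprice_a5&r=41"/>
  <a href="screener.ashx?v=111&f=targetprice_a5&r=61"/>
</table>"""

parser = PageLinkParser()
parser.feed(sample)
offsets = parser.offsets  # first-row numbers of the linked pages
```

The largest offset found this way tells you where the last page starts, which is enough to know how many page URLs to request.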

Note that only one new parameter, "r=21", is added to the URL; the rest stay consistent across the different result pages.
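
Since only "r" changes, every page URL can be generated from the first-row number. The sketch below is in Python rather than Google Sheets and rests on a few assumptions: the relative href resolves against `https://finviz.com/`, each page holds 20 rows, and the total row count is known (the `total_rows=65` value is invented for the example). The function name `page_urls` is hypothetical.

```python
from urllib.parse import urlencode


def page_urls(base, params, total_rows, page_size=20):
    """Build one URL per results page by varying only the "r" parameter."""
    urls = []
    # First rows of each page: 1, 21, 41, ... up to total_rows
    for first_row in range(1, total_rows + 1, page_size):
        query = dict(params, r=first_row)
        urls.append(f"{base}?{urlencode(query)}")
    return urls


# Assumed base URL and an invented row count, for illustration only.
urls = page_urls(
    "https://finviz.com/screener.ashx",
    {"v": 111, "f": "targetprice_a5"},
    total_rows=65,
)
for url in urls:
    print(url)
```

Each generated URL would then be fetched separately and the resulting tables stacked; the same row arithmetic (1, 21, 41, ...) applies whichever tool does the fetching.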

Is this even possible with Google Sheets?

