Why can't I return a figure when scraping an HTML page?

Time:03-17

I am trying to extract the 24-hour volume from this page. There is an API, but the volume does not seem to be included in the JSON it returns (at least I can't get it to work). I first tried simple scraping with regex and am now using the lxml XPath method.

What can I do to get the 24-hour volume from this page? Is it protected?

This is my latest code:

from lxml import html
import requests

swyftx_page = requests.get('https://swyftx.com/au/buy/bitcoin/')
swyftx_tree = html.fromstring(swyftx_page.content)
swyftx_prices_btc = swyftx_tree.xpath('/html/body/section[1]/div/div[2]/div/div[2]/div[2]/div[3]/h3/text()')
print(swyftx_prices_btc)

When I run this, I get:

['$0.00']

Which is obviously not right. I am expecting an answer like:

['34,560,324,200']

CodePudding user response:

The data you see on the page is loaded from an external URL via JavaScript. To fetch it directly with the requests module, you can use this example:

import json
import requests


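url = "https://apic.swyftx.io/markets/aud/"

# fail fast on network or HTTP errors instead of parsing an error page
response = requests.get(url, timeout=10)
response.raise_for_status()
data = response.json()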

# uncomment to print all data:
# print(json.dumps(data, indent=4))


for d in data:
    if d["name"] == "Bitcoin":
        print("Volume:", d["volume24H"])
        break

Prints:

Volume: 34974203469
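
If you want the figure formatted with thousands separators, as in the question, the lookup can be wrapped in a small helper. This is only a sketch: the function names find_volume and format_volume are made up for illustration, the sample list is hypothetical data shaped like the fields used above ("name" and "volume24H"), and it assumes the volume may arrive either as a number or as a numeric string.

```python
def find_volume(markets, asset_name):
    """Return the 24h volume for asset_name, or None if it is not listed."""
    for market in markets:
        if market.get("name") == asset_name:
            return market.get("volume24H")
    return None


def format_volume(value):
    """Format a volume with thousands separators and no decimals."""
    # float() accepts both numbers and numeric strings (assumption:
    # the API may return either form)
    return f"{float(value):,.0f}"


# Hypothetical sample data, shaped like the fields used in the answer above
sample = [
    {"name": "Ethereum", "volume24H": "1200"},
    {"name": "Bitcoin", "volume24H": 34974203469},
]

volume = find_volume(sample, "Bitcoin")
if volume is not None:
    print(format_volume(volume))
```

Returning None for an unknown name lets the caller decide what to do, instead of the loop silently printing nothing.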

CodePudding user response:

Disable JavaScript for this page in your browser. You will see that the 24h volume is initially set to $0.00, which means it is filled in later by JavaScript. Use the Network tab of your browser's developer tools to find the request that loads the data, and query that URL directly instead.
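
The same check can be done without a browser by looking at what the static HTML contains before any script runs. The sketch below uses only the standard library and a hypothetical HTML snippet (not the real page markup) to show why a scraper that does not execute JavaScript sees the placeholder; H3Collector and static_h3_texts are illustrative names.

```python
from html.parser import HTMLParser


class H3Collector(HTMLParser):
    """Collect the text found inside <h3> elements of static HTML."""

    def __init__(self):
        super().__init__()
        self._in_h3 = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_h3 = True

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False

    def handle_data(self, data):
        # keep only non-empty text that appears inside an <h3>
        if self._in_h3 and data.strip():
            self.texts.append(data.strip())


def static_h3_texts(html_text):
    parser = H3Collector()
    parser.feed(html_text)
    return parser.texts


# Hypothetical markup: the server sends a placeholder, and a script
# would replace it in the browser after the page has loaded.
sample_html = """
<section><div>
  <h3>$0.00</h3>
  <script>/* a script would fill in the real figure after load */</script>
</div></section>
"""

print(static_h3_texts(sample_html))  # ['$0.00']
```

Neither requests nor lxml runs scripts, so the placeholder is all they can ever return for such an element.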
