I am trying to extract the 24-hour volume from this page. They have an API, but the volume doesn't seem to be returned in the JSON data (at least I can't get it to work). I have tried simple scraping with regex and am now using the lxml XPath method.
What can I do to get the 24-hour volume from this page? Is it protected?
This is my latest code:
from lxml import html
import requests
swyftx_page = requests.get('https://swyftx.com/au/buy/bitcoin/')
swyftx_tree = html.fromstring(swyftx_page.content)
swyftx_prices_btc = swyftx_tree.xpath('/html/body/section[1]/div/div[2]/div/div[2]/div[2]/div[3]/h3/text()')
print(swyftx_prices_btc)
When I run this, I get:
['$0.00']
Which is obviously not right. I am expecting an answer like:
['34,560,324,200']
CodePudding user response:
The data you see on the page is loaded from an external URL via JavaScript. To fetch it directly with the requests module, you can use this example:
import json
import requests
url = "https://apic.swyftx.io/markets/aud/"
data = requests.get(url).json()
# uncomment to print all data:
# print(json.dumps(data, indent=4))
for d in data:
    if d["name"] == "Bitcoin":
        print("Volume:", d["volume24H"])
        break
Prints:
Volume: 34974203469
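If you need the volume for more than one asset, or want to avoid a KeyError when a field is missing, the lookup can be wrapped in a small helper. This is only a sketch: the function name volume_for and the sample payload are made up for illustration, and it assumes the response is a list of dicts with the "name" and "volume24H" keys used above. The Ethereum value is a placeholder, not real data.

```python
def volume_for(markets, name):
    """Return the 24h volume for the asset called `name`, or None if absent."""
    for entry in markets:
        if entry.get("name") == name:
            # "volume24H" is the key used in the answer above; .get() returns
            # None instead of raising KeyError if the key is missing.
            return entry.get("volume24H")
    return None


# Hypothetical payload in the shape the answer implies (a list of dicts).
# The Ethereum number is a placeholder for illustration only.
sample = [
    {"name": "Ethereum", "volume24H": 1},
    {"name": "Bitcoin", "volume24H": 34974203469},
]

print("Volume:", volume_for(sample, "Bitcoin"))
```

With live data you would pass requests.get(url).json() as the markets argument instead of sample.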
CodePudding user response:
Disable JavaScript for this page in your browser. You will see that the 24h volume is initially set to $0.00, which means it is filled in later by JavaScript. Open the Network tab of your browser's developer tools, find the request that returns the volume, and query that URL directly instead.
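You can make the same check from Python by looking at the raw HTML that requests receives, since no JavaScript runs there. The sketch below uses only the standard library; the class name H3Collector, the helper h3_texts and the sample markup are made up for illustration and are not the real page source.

```python
from html.parser import HTMLParser


class H3Collector(HTMLParser):
    """Collect the text content of every <h3> element in a document."""

    def __init__(self):
        super().__init__()
        self._in_h3 = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_h3 = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current entry.
        if self._in_h3:
            self.texts[-1] += data


def h3_texts(html_text):
    """Return the stripped text of each <h3> in html_text."""
    parser = H3Collector()
    parser.feed(html_text)
    parser.close()
    return [t.strip() for t in parser.texts]


# Made-up stand-in for the static HTML: the value is still the placeholder.
sample_html = "<html><body><div><h3>$0.00</h3></div></body></html>"
print(h3_texts(sample_html))
```

If the placeholder shows up here too, the real value is not in the HTML at all, and no XPath or regex on the page source will find it.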