I'm trying to get BTC volume data from https://finance.yahoo.com using BeautifulSoup.
Source of the Yahoo Finance page:
<tr ><td ><span>Volume</span></td><td data-test="TD_VOLUME-value"><fin-streamer data-symbol="BTC-USD" data-field="regularMarketVolume" data-trend="none" data-pricehint="2" data-dfield="longFmt" value="12,728,260,608" active="">12,728,260,608</fin-streamer></td></tr>
<tr ><td ><span>Volume (24hr)</span></td><td data-test="TD_VOLUME_24HR-value">12.73B</td></tr><tr ><td ><span>Volume (24hr) All Currencies</span></td><td data-test="TD_VOLUME_24HR_ALLCURRENCY-value">12.73B</td></tr>
I'm trying something like this:
url1v = "https://finance.yahoo.com/quote/BTC-USD"
page1v = requests.get(url1v)
html_page1v = BeautifulSoup(page1v.content, "html.parser")
btc_volume = html_page1v.find("span", {"class":"e3b14781 dde7f18a"})
print(btc_volume)
Output: None
How should I write the code?
CodePudding user response:
First of all, always check your response / soup: what is not in it, you will not find.
The request needs a user-agent header; otherwise the server may block the request or not provide the correct content.
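As a rough sketch of that check, the snippet below parses the markup quoted in the question and tests whether a selector matches anything in it. The names `SAMPLE_HTML` and `is_present` are made up for illustration, and the markup is a trimmed copy of the question's excerpt, not a live response.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup quoted in the question (not a live response).
SAMPLE_HTML = (
    '<tr><td><span>Volume</span></td><td data-test="TD_VOLUME-value">'
    '<fin-streamer data-symbol="BTC-USD" data-field="regularMarketVolume" '
    'value="12,728,260,608" active="">12,728,260,608</fin-streamer></td></tr>'
)

def is_present(html, selector):
    """Return True if the CSS selector matches at least one element in html."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.select_one(selector) is not None

# The class names used in the question do not occur in this markup ...
print(is_present(SAMPLE_HTML, "span.e3b14781.dde7f18a"))
# ... while the data-field attribute does.
print(is_present(SAMPLE_HTML, '[data-field="regularMarketVolume"]'))
```

Running the same check on `response.text` from the real request would show whether the server delivered the expected content at all.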
Example
The user agent in the example is fictitious (guess the movie), since its content does not seem to be checked. Check user-agent for further information:
from bs4 import BeautifulSoup
import requests
url = 'https://finance.yahoo.com/quote/BTC-USD'
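# Send a user-agent header; without it the server may not return the expected content
response = requests.get(url, headers={'user-agent': 'Tom Bishop ;)'})
soup = BeautifulSoup(response.text, 'html.parser')
# Select the element by its data-field attribute and print its text
print(soup.select_one('[data-field="regularMarketVolume"]').text)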
Output
12,394,450,944
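If the value is needed as a number and not as a formatted string, something along these lines might work. It is only a sketch: `extract_volume` is a hypothetical helper, and it assumes the element keeps the `data-field` and `value` attributes shown in the question's markup, which the site may change at any time.

```python
from bs4 import BeautifulSoup

def extract_volume(html):
    """Return the regularMarketVolume as an int, or None if it is not in html."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.select_one('[data-field="regularMarketVolume"]')
    if tag is None:
        # Element missing, e.g. because the server returned different content
        return None
    # Prefer the value attribute, fall back to the element's text
    raw = tag.get("value") or tag.get_text(strip=True)
    return int(raw.replace(",", ""))

# Markup taken from the question, trimmed to the relevant element
sample = (
    '<td data-test="TD_VOLUME-value"><fin-streamer data-symbol="BTC-USD" '
    'data-field="regularMarketVolume" value="12,728,260,608" active="">'
    '12,728,260,608</fin-streamer></td>'
)
print(extract_volume(sample))  # 12728260608
```

Returning None when the element is missing avoids the AttributeError that a direct `.text` access raises when `select_one` finds nothing.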