I am making a web scraper that scrapes Yahoo Finance and tells me the current stock price.
I keep getting this error when I run the program:
IndexError: list index out of range
This is the code:
import bs4
import requests

def parsePrice():
    r = requests.get('https://finance.yahoo.com/quote/F?p=F')
    soup = bs4.BeautifulSoup(r.text, 'xml')
    # the next line is the supposed problem
    price = soup.find_all('div', {'class': 'My(6px) Pos(r) smartphone_Mt(6px)'})[0].find('span').text
    return price

while True:
    print('the current price is: ' + str(parsePrice()))
I am a beginner in Python, so any help would be appreciated :)
CodePudding user response:
What happens?
Note: Always look at your soup first - therein lies the truth. The content can differ slightly or drastically from what you see in the browser's dev tools.
There is no <div> with the class you are searching for in the soup, so find_all() returns an empty result set, and picking index [0] from it raises the IndexError.
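To illustrate the failure described above, here is a minimal sketch of what happens when you index an empty result, plus one way to guard against it. An empty list stands in for the empty result set returned by find_all() (which behaves like a list), and first_or_none is a hypothetical helper name, not part of BeautifulSoup.

```python
def first_or_none(results):
    """Return the first item of a sequence, or None when it is empty."""
    return results[0] if results else None

# Assumption: an empty list stands in for the empty result set that
# find_all() returns when no tag matches, so no network access is needed.
no_matches = []

try:
    no_matches[0]
    error_name = None
except IndexError as exc:
    # This is the same error the question reports.
    error_name = type(exc).__name__

print(error_name)                 # IndexError
print(first_or_none(no_matches))  # None
print(first_or_none(["20.25"]))   # 20.25
```

Checking for None before reading .text makes it obvious when the page did not contain the element you expected.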
How to fix?
Add some headers to your request so that it looks like it comes from a browser:
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'}
Select your element more specifically - because you know the symbol from your request, you can select the element directly by its data-symbol attribute:
soup.select_one('[data-symbol="F"]')['value']
Example
Note: First rule of scraping: do not harm the website! The volume and frequency of your queries should not burden the website's servers. So please add some delay between your requests (import time, then time.sleep(60)) or use an official API.
import time

import bs4
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
}

def parsePrice():
    r = requests.get('https://finance.yahoo.com/quote/F?p=F', headers=headers)
    soup = bs4.BeautifulSoup(r.text, 'xml')
    # select the element by its data-symbol attribute and read its value
    price = soup.select_one('[data-symbol="F"]')['value']
    return price

while True:
    print('the current price is: ' + str(parsePrice()))
    # wait between requests so the site is not burdened (see the note above)
    time.sleep(60)
Output
the current price is: 20.25
the current price is: 20.25
the current price is: 20.25
the current price is: 20.25
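If you would rather not loop forever, the delay recommended in the note can also be wrapped in a small bounded polling helper. This is only a sketch under stated assumptions: poll, fetch, interval_seconds, max_polls and the injectable sleep parameter are hypothetical names chosen for illustration, and a lambda stands in for parsePrice so the demo needs no network.

```python
import time

def poll(fetch, interval_seconds=60, max_polls=3, sleep=time.sleep):
    """Call fetch() max_polls times, pausing interval_seconds between calls."""
    results = []
    for attempt in range(max_polls):
        results.append(fetch())
        # Pause between requests, but not after the final one.
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    return results

# Demo with a stand-in fetch function and a recording "sleep",
# so nothing actually waits or touches the network.
pauses = []
prices = poll(lambda: "20.25", interval_seconds=60, max_polls=3, sleep=pauses.append)
print(prices)  # ['20.25', '20.25', '20.25']
print(pauses)  # [60, 60]
```

In the real scraper you would pass parsePrice as fetch and leave sleep at its default, so each request is followed by a real 60 second pause.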