I am trying to scrape Bitcoin historical price data from Yahoo Finance while still being able to choose the start date of the data. My code is below, but it fails with the error "HTTP Error 404: Not Found". Can you tell me where this error comes from and how to fix it?
import time
import datetime
import pandas as pd
period1 = int(time.mktime(datetime.datetime(2020, 1, 1, 23, 59).timetuple()))
period2 = int(time.mktime(datetime.datetime(2022, 1, 1, 23, 59).timetuple()))
basic_url = 'https://fr.finance.yahoo.com/quote/BTC-USD/historyperiod1=1606780800&period2=1609372800&interval=1dk&filter=history&frequency=1wk&includeAdjustedClose=true'
modified_url = 'https://fr.finance.yahoo.com/quote/BTC-USD/history?period1={period1}&period2={period2}&interval=1d&filter=history&frequency=1d&includeAdjustedClose=true'
df=pd.read_csv(modified_url)
df
Thanks in advance!
Thibaut
CodePudding user response:
Your logic is sound.
However, your address seems to be wrong: it points at the history web page rather than at a CSV file. I checked their site, and the direct working link is:
modified_url = f'https://query1.finance.yahoo.com/v7/finance/download/BTC-USD?period1={period1}&period2={period2}&interval=1d&events=history&includeAdjustedClose=true'
Edit:
Since period1 and period2 are already integers, they need no further conversion or format specifier inside the f-string. Also, try not to use variable names like 'a'. They are a pain to figure out later.
input_year = int(input("From which year do you want to start scraping data?\n"))
period1 = int(time.mktime(datetime.datetime(input_year, 1, 1, 23, 59).timetuple()))
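Putting the pieces of this answer together, here is a minimal sketch of how the URL could be built from a chosen start year. The helper names `to_epoch` and `build_download_url` are made up for illustration, and the endpoint is simply the one quoted above; whether it still responds, or needs extra headers, is an assumption I have not verified. The network call is left commented out.

```python
import datetime
import time


def to_epoch(year, month=1, day=1, hour=23, minute=59):
    """Convert a local date and time to integer Unix seconds, as in the question."""
    moment = datetime.datetime(year, month, day, hour, minute)
    return int(time.mktime(moment.timetuple()))


def build_download_url(ticker, period1, period2, interval="1d"):
    """Fill the query parameters into the download link quoted in the answer.

    Assumption: the endpoint is taken as-is from the answer above.
    """
    return (
        f"https://query1.finance.yahoo.com/v7/finance/download/{ticker}"
        f"?period1={period1}&period2={period2}"
        f"&interval={interval}&events=history&includeAdjustedClose=true"
    )


start_year = 2020  # hypothetical value; could come from input() instead
url = build_download_url("BTC-USD", to_epoch(start_year), to_epoch(2022))
print(url)

# Network call, only if the endpoint is reachable from your machine:
# import pandas as pd
# df = pd.read_csv(url)
```

Keeping the URL construction in a function makes it easy to print and check the final address before handing it to pd.read_csv.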
CodePudding user response:
If you visit the URL in your browser, you can see the problem: it does not lead to a valid page and gets redirected, hence the 404.
To fix it, use a URL that works. Perhaps you have made a mistake in the GET parameters?
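One way to check the GET parameters without opening a browser is to split the URL with the standard library and look at what ends up in the query string. This is only a sketch: `inspect_url` is a made-up helper, and the three URLs are shortened variants of the ones in the question.

```python
from urllib.parse import parse_qs, urlsplit


def inspect_url(url):
    """Return the path and the parsed query parameters of a URL."""
    parts = urlsplit(url)
    return parts.path, parse_qs(parts.query)


base = "https://fr.finance.yahoo.com/quote/BTC-USD/history"

# Variant 1: the "?" is missing, so everything becomes part of the path.
no_question_mark = base + "period1=1606780800&period2=1609372800"
# Variant 2: a plain string, so the placeholders are sent literally.
unformatted = base + "?period1={period1}&period2={period2}"
# Variant 3: a well-formed query string.
well_formed = base + "?period1=1606780800&period2=1609372800"

for candidate in (no_question_mark, unformatted, well_formed):
    path, params = inspect_url(candidate)
    print(path, params)
```

The first variant yields no parameters at all, the second yields the literal text {period1} as a value, and only the third carries real timestamps.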
CodePudding user response:
Maybe the f prefix is missing, so the placeholders are never filled in:
modified_url = f'https://fr.finance.yahoo.com/quote/BTC-USD/history?period1={period1}&period2={period2}&interval=1d&filter=history&frequency=1d&includeAdjustedClose=true'
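To see the difference the prefix makes, here is a small sketch comparing a plain string, str.format, and an f-string. The two timestamps are arbitrary placeholder integers, not values taken from the question.

```python
# Arbitrary placeholder timestamps, for illustration only.
period1 = 1577919540
period2 = 1641077940

# Plain string: the braces are kept literally, nothing is substituted.
plain = "history?period1={period1}&period2={period2}"

# str.format fills the same template explicitly.
formatted = plain.format(period1=period1, period2=period2)

# f-string: the variables in scope are substituted when the string is created.
fstring = f"history?period1={period1}&period2={period2}"

print(plain)
print(formatted)
print(fstring)
```

Without the prefix, the server receives the literal text {period1} and {period2}, which is consistent with the request in the question failing.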