Home > Enterprise >  Download historical prices from yahoo finance cookie issue?
Download historical prices from yahoo finance cookie issue?

Time:05-03

I have an excel file that download historical prices from yahoo finances (I usually download last 10 years) for stocks.

However, since a couple of days, I have an issue and it seems it is coming from the cookie. When I run the program I have the below message (sorry it is in French)

error message from Excel

I can remove them but then the program only download the last 8 months.

Dim crumb As String
Dim cookie As String
Dim response As String
Dim strURL As String
Dim DownloadURL As String
Dim period1, period2 As Double
Dim startdate2 As Date
Dim httpReq As WinHttp.WinHttpRequest
Set httpReq = New WinHttp.WinHttpRequest

Application.ScreenUpdating = False

DownloadURL = "https://finance.yahoo.com/lookup?s=" & stockSymbol

startdate2 = DateValue("01/01/1970")
period1 = CDbl((startDate - startdate2) * 86400)
period2 = CDbl((endDate - startdate2) * 86400)

With httpReq
    .Open "GET", DownloadURL, False
    .setRequestHeader "Content-Type", "application/x-www-form-urlencoded;charset=UTF-8"
    .send
    .waitForResponse
    response = .responseText
    cookie = Split(.getResponseHeader("Set-cookies"), ";")(0)
End With

Dim counter As Long
Dim startCounter As Long
Dim result As String

crumb = Chr(34) & "CRUMB" & Chr(34) & ":{" & Chr(34) & ".crumb" & Chr(34) & ":" & Chr(34)
startCounter = InStr(response, crumb)   Len(crumb)
While Mid(response, startCounter, 1) <> Chr(34)
    result = result & Mid(response, startCounter, 1)
    startCounter = startCounter   1
Wend
crumb = result

DownloadURL = "https://query1.finance.yahoo.com/v7/finance/download/" & stockSymbol & "?period1=" & period1 & "&period2=" & period2 & "&interval=1d&events=history&crumb=" & crumb

With httpReq
    .Open "GET", DownloadURL, False
    .setRequestHeader "Cookie", cookie
    .setRequestHeader "strict-transport-security", "max-age=0"
    .send
    Debug.Print .getAllResponseHeaders
    .waitForResponse
    dataresult = .responseText
    
    Debug.Print dataresult
End With

dataresult = Replace(dataresult, ",", vbTab)

Thank you in advance for your help :)

CodePudding user response:

I also started seeing this same issue a day or so ago in my c# program. I first manually attempted to 'override' the cookie being used to one I scraped from my web browser, which worked until it didn't haha.

From what I can see, I think yahoo is playing around with their API throttling, which is unfortunate since I cant seem to find an official documentation for what is 'allowed' for requests per hour/day/minute. Best I found said that they 'officially' limit it to 2000 requests per hour, but in my testing I have been able to get up to ~7000 per hour with no issues since I started noticing this issue. (another source said they got to 8000 requests in 15 minutes before getting a 999 error but I was not able to get the same results, so keep that in mind maybe)

My advice would be to implement some throttling manually, so you can personally configure it to stress test it. Not sure how you would do that in Excel, but here is a really basic way it is done in c#

    private static readonly DateTime Jan1st1970 = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
    public static long CurrentTimeMillis()
    {
        return (long)(DateTime.UtcNow - Jan1st1970).TotalMilliseconds;
    }
    public static long lastCalled { get; set; } = CurrentTimeMillis();
    public static int requestsThisHour = 0;
    public static long latestHourStartMilis = CurrentTimeMillis();
    public static long latestHourEndMilis { get; set; }
    public static int maxRequestsPerHour = 7000;

    public static async Task<string> GetRawAsync(string symbol, DateTime start, DateTime end, string eventType = "history")
    {
        string csvData = null;

        // check if it has been 1 hour since starting, if so, set hourmili counters to new values
        if (CurrentTimeMillis() > latestHourEndMilis)
        {
            latestHourStartMilis = CurrentTimeMillis();
            latestHourEndMilis = latestHourStartMilis   3600000;
            requestsThisHour = 0;
        } else
        {
            requestsThisHour  ;
        }
        
        // Throttle
        var currentRequestRate = (CurrentTimeMillis() - latestHourStartMilis) / requestsThisHour;
        var desiredRequestRate = (latestHourEndMilis - latestHourStartMilis) / maxRequestsPerHour;
        utils.log.Debug("Average Rate: "   currentRequestRate   "ms | Max Desired Rate: "   desiredRequestRate   "ms"); // debug output since I am still testing this
        while ((CurrentTimeMillis() - lastCalled) < desiredRequestRate)
        {
            int timeToWait = (int)desiredRequestRate - (int)(CurrentTimeMillis() - lastCalled);
            utils.log.Debug("Waiting: "   timeToWait   "ms");// debug output since I am still testing this
            await Task.Delay(timeToWait);
        }

        // make request for historical data & return it
    }

I hope this helps, or at least yahoo decides to publicly let it be known what they limit requests to haha.

Good Luck!

CodePudding user response:

Looks like Yahoo are no longer return the cookie that they always used to return. Duplicate of Historical Yahoo Finance API On Fritz Again?.

My code is:

      const string userAgentKey = "User-Agent";
        const string userAgentValue =
            "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36";

        // random query to avoid cached response
        var client = new FlurlClient($"https://finance.yahoo.com?{GetRandomString(8)}")
            .WithHeader(userAgentKey, userAgentValue)
            .EnableCookies();

        await client.Request().GetAsync(token).ConfigureAwait(false);

        if (client.Cookies?.Count > 0)
            return client;

But client.Cookies?.Count is always == 0 now as of a couple of days ago. I don't believe it is connected with throttling as I haven't used system for couple of days, and this is happening on very first call.

  • Related