Protocol Error|404 reading JSON from API with HttpClient or WebClient

Time:10-25

I am trying to use this service to fetch JSON data for further processing locally:

https://www.sec.gov/edgar/sec-api-documentation

Example for testing (from documentation link above):

https://data.sec.gov/api/xbrl/frames/us-gaap/AccountsPayableCurrent/USD/CY2019Q1I.json

This works fine in the browser. However, every method I have tried (listed below) fails, mostly with Protocol Error / 404.

Following the documentation at https://www.sec.gov/os/webmaster-faq#developers

I have added the required header entries (correctly, I think).

I keep reading that the preferred method is to use HttpClient. However, all the simple examples I have found are in C# and/or use lambda expressions, which I don't understand. I translated them to VB as best I could, but I am not sure I got it right.

I have also added this to the App.config:

<system.net>
  <settings>
    <httpWebRequest useUnsafeHeaderParsing = "true"/>
  </settings>
</system.net>

The application is a WinForms app on .NET Framework 4.7.2. The OS is Windows 10 Pro, 20H2, 64-bit operating system, x64-based processor.

Imports System.IO
Imports System.Net
Imports System.Net.Http

Public Module SECJSON

    Public sourceUrl As String = "https://data.sec.gov/api/xbrl/frames/us-gaap/AccountsPayableCurrent/USD/CY2019Q1I.json"

    Public ua As String = "My Edgar Reader [email protected]"
    Public accept As String = ""
    'Public accept As String = "application/json"
    'Public accept As String = "text/plain"
    'Public accept As String = "*/*"

    Public Sub RunAttempts()
        Debug.WriteLine("HttpWebRequest:")
        Debug.WriteLine(Fetch1(sourceUrl))
        Debug.WriteLine("-----------")
        Debug.WriteLine("")

        Call Fetch2(sourceUrl)

        Debug.WriteLine("WebClient:")
        Debug.WriteLine(Fetch3(sourceUrl))
        Debug.WriteLine("-----------")
        Debug.WriteLine("")

        Debug.WriteLine("WebClient File:")
        Debug.WriteLine(Fetch4(sourceUrl))
        Debug.WriteLine("-----------")
        Debug.WriteLine("")

    End Sub

    Public Function Fetch1(url As String) As String
        Dim request As HttpWebRequest
        Dim response As HttpWebResponse
        Dim reader As StreamReader

        'request = WebRequest.Create(url)
        request = DirectCast(WebRequest.Create(url), HttpWebRequest)
        request.ContentType = "application/json"
        'request.Method = "POST"
        request.Method = "GET"

        request.UserAgent = ua
        request.Host = "www.sec.gov"
        request.Headers("Accept-Encoding") = "gzip, deflate"

        If Not (accept = "") Then request.Accept = accept

        Dim result As String = ""

        Try
            response = DirectCast(request.GetResponse(), HttpWebResponse)
            If response Is Nothing = False Then
                reader = New StreamReader(response.GetResponseStream())
                result = reader.ReadToEnd()
            Else
                result = "No Data"
            End If
        Catch ex As Exception
            result = ex.Message
        End Try

        Return result
    End Function

    Public Async Sub Fetch2(page As String)

        Dim request As New HttpRequestMessage(HttpMethod.Get, New Uri(page))
        If Not (accept = "") Then request.Headers.Accept.Add(New Headers.MediaTypeWithQualityHeaderValue(accept))
        request.Headers.TryAddWithoutValidation("User-Agent", ua)
        request.Headers.TryAddWithoutValidation("Host", "www.sec.gov")
        request.Headers.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate")

        Dim client As HttpClient = New HttpClient

        Using response As HttpResponseMessage = Await client.SendAsync(request)
            Using content As HttpContent = response.Content
                'content.Headers.Remove("Content-Type")
                'content.Headers.Add("Content-Type", "application/json")
                Dim result As String = Await content.ReadAsStringAsync()
                If result Is Nothing Then
                    Debug.WriteLine("HttpClient:No result")
                Else
                    Debug.WriteLine("HttpClient:")
                    Debug.WriteLine(result)
                    Debug.WriteLine("-----------")
                    Debug.WriteLine("")
                End If
            End Using
        End Using
    End Sub

    Public Function Fetch3(url As String) As String

        Dim client As New WebClient()
        'ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12

        client.Headers.Add("User-Agent", ua)
        client.Headers.Add("Host", "www.sec.gov")
        client.Headers.Add("Accept-Encoding", "gzip, deflate")

        If Not (accept = "") Then client.Headers.Add("accept", accept)

        Dim ans As String = ""
        Try
            ans = client.DownloadString(New Uri(url))
        Catch ex As WebException
            If ex.Response Is Nothing = False Then
                'Dim dataStream As Stream = ex.Response.GetResponseStream
                'Dim reader As New StreamReader(dataStream)
                'ans = reader.ReadToEnd()
                ans = ex.Message & "|" & ex.Status.ToString
            Else
                ans = "No Response Stream"
            End If
        End Try

        Return ans
    End Function

    Public Function Fetch4(url As String) As String

        Dim client As New WebClient()
        Dim tmpFile As String = Path.Combine(Path.GetTempPath(), "edgar.txt")
        Debug.WriteLine(tmpFile)
        Try
            client.DownloadFile(New Uri(url), tmpFile)
            Return File.ReadAllText(tmpFile)
        Catch ex As WebException
            If ex.Response Is Nothing = False Then
                Return ex.Message & "|" & ex.Response.ContentType & " " & ex.Response.ContentLength.ToString
            Else
                Return ex.Message
            End If
        End Try

    End Function

End Module

Answer:

The problem is that your Host header is set to 'www.sec.gov', so the server treats the request as if it were for 'https://www.sec.gov/api/xbrl/frames/us-gaap/AccountsPayableCurrent/USD/CY2019Q1I.json' instead of 'https://data.sec.gov/api/xbrl/frames/us-gaap/AccountsPayableCurrent/USD/CY2019Q1I.json'.

The solution is simple. Change the Host header from:

request.Host = "www.sec.gov"

to:

request.Host = "data.sec.gov"

The same change applies to the Host header set in Fetch2 and Fetch3.

Next time, if you can't identify the problem, try using Fiddler to see what's wrong with your request.
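
As a minimal sketch of this fix applied to the HttpClient attempt (Fetch2), the version below does not set a Host header at all, so it is taken from the URL and cannot disagree with it. The module and member names (SecFrameFetcher, BuildRequest, FetchJsonAsync) are made up for illustration. The User-Agent string is copied from the question as it appears there and should be replaced with your own contact details. It assumes the Host header was the only thing wrong with the original request, as described above.

```vb
Imports System
Imports System.Net
Imports System.Net.Http
Imports System.Threading.Tasks

Public Module SecFrameFetcher

    ' Example URL from the question.
    Public Const SourceUrl As String = "https://data.sec.gov/api/xbrl/frames/us-gaap/AccountsPayableCurrent/USD/CY2019Q1I.json"

    ' Copied from the question as shown there; replace with your own contact details.
    Private Const UserAgent As String = "My Edgar Reader [email protected]"

    ' One shared HttpClient. The handler requests gzip/deflate and
    ' decompresses the response body before it is read as a string.
    Private ReadOnly Client As New HttpClient(New HttpClientHandler With {
        .AutomaticDecompression = DecompressionMethods.GZip Or DecompressionMethods.Deflate
    })

    ' Builds the GET request. No Host header is added, so the host
    ' sent to the server is the one in the URL (data.sec.gov here).
    Public Function BuildRequest(url As String) As HttpRequestMessage
        Dim request As New HttpRequestMessage(HttpMethod.Get, New Uri(url))
        request.Headers.TryAddWithoutValidation("User-Agent", UserAgent)
        request.Headers.TryAddWithoutValidation("Accept", "application/json")
        Return request
    End Function

    ' Sends the request and returns the body as a string.
    ' Throws HttpRequestException on a non-success status such as 404.
    Public Async Function FetchJsonAsync(url As String) As Task(Of String)
        Using request As HttpRequestMessage = BuildRequest(url)
            Using response As HttpResponseMessage = Await Client.SendAsync(request)
                response.EnsureSuccessStatusCode()
                Return Await response.Content.ReadAsStringAsync()
            End Using
        End Using
    End Function

End Module
```

From an Async event handler in the form, this could be called as Dim json As String = Await SecFrameFetcher.FetchJsonAsync(SecFrameFetcher.SourceUrl). Returning a Task(Of String) rather than using Async Sub lets the caller await the result and catch any exception.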
