GitHub blocks my access token after 3-4 search API calls

Time:01-30

I am using PyCurl to call the GitHub search API and extract some information. Here is the code snippet that calls the API:

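from io import BytesIO
import pycurl

# access_token is assumed to hold a GitHub personal access token (defined elsewhere).
# Spaces are not valid in a URL, so the search qualifiers are joined with "+".
url = "https://api.github.com/search/code?q=import+keras+size:1..100+language:python&page=1&per_page=100"
output = BytesIO()
request = pycurl.Curl()
request.setopt(pycurl.HTTPHEADER, [f'Authorization: token {access_token}'])
request.setopt(request.URL, url)
request.setopt(request.WRITEDATA, output)
request.perform()
request.close()  # release the handle; the response body is in output.getvalue()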

The problem is that GitHub blocks my access token after just 3-4 requests, even though the GitHub documentation gives 5,000 requests per hour as the limit. I am using Python 3.8 and PyCurl 7.44.1. Do you have any idea how to resolve this problem?

CodePudding user response:

"The Search API has a custom rate limit, separate from the rate limit governing the rest of the REST API"

You can check your rate limit status like this:

curl \
  -H "Accept: application/vnd.github.v3+json" \
  https://api.github.com/rate_limit
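
If you would rather read that response from Python, here is a minimal sketch of pulling the search quota out of it. It assumes the body has the usual shape, with a "search" entry under "resources" holding "remaining" and "reset" (epoch seconds). The helper name search_quota and the sample numbers are made up for illustration and are not real output.

```python
import json
import time


def search_quota(rate_limit_body, now=None):
    """Return (remaining, seconds_until_reset) for the search resource.

    rate_limit_body is the JSON text returned by /rate_limit.
    now is an optional epoch timestamp (defaults to the current time).
    """
    data = json.loads(rate_limit_body)
    search = data["resources"]["search"]
    now = time.time() if now is None else now
    wait = max(0, search["reset"] - int(now))
    return search["remaining"], wait


# Invented sample payload, only to show the shape being assumed.
sample = json.dumps({
    "resources": {
        "core": {"limit": 5000, "remaining": 4999, "reset": 1700003600},
        "search": {"limit": 30, "remaining": 0, "reset": 1700000060},
    }
})

remaining, wait = search_quota(sample, now=1700000000)
print(f"search calls left: {remaining}, resets in {wait}s")
```

With the question's code, the body to pass in would be output.getvalue().decode() after requesting the rate_limit URL.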

CodePudding user response:

GitHub has a different rate limit for search requests because they are substantially more expensive than a normal API call. You can query your current limits at the endpoint https://api.github.com/rate_limit.

However, in your case, you're seeing a secondary rate limit, which means that something about your requests looks suspicious and you're getting blocked for that reason. The only way to find out exactly why is to contact GitHub Support.
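
Whatever the cause turns out to be, it usually helps to back off when a request is rejected instead of retrying immediately. Below is a rough sketch of how the wait could be chosen from the response. It assumes, as I understand GitHub's guidance, that a rate-limited response comes back as 403 or 429, that retry-after (in seconds) takes priority when present, and that x-ratelimit-reset is an epoch timestamp. The function name seconds_to_wait and the one-minute fallback are my own choices.

```python
import time


def seconds_to_wait(status, headers, now=None):
    """Suggest how long to pause before retrying a GitHub API request.

    headers is a dict with lower-cased header names.
    Returns 0 when the response does not look rate limited.
    """
    # Assumption: any 403/429 is treated as a rate-limit response here,
    # although a 403 can also mean a plain permission problem.
    if status not in (403, 429):
        return 0
    now = time.time() if now is None else now
    if "retry-after" in headers:
        return int(headers["retry-after"])
    if headers.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in headers:
        return max(0, int(headers["x-ratelimit-reset"]) - int(now))
    return 60  # fallback pause when the response gives no hint


print(seconds_to_wait(403, {"retry-after": "30"}))
```

With PyCurl the status is available through request.getinfo(pycurl.RESPONSE_CODE); collecting the headers needs a HEADERFUNCTION callback, which is left out here.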

I will point out that it's a best practice to use a unique, identifying User-Agent header so that your traffic can be distinguished from other traffic. That may or may not help here, but libcurl is a very common user-agent, and since there will be a nontrivial number of people using it for abusive purposes, it's possible that your traffic got flagged by an automated system for that reason.
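
As a sketch of what that could look like with the code in the question, the header list can be built in one place and handed to PyCurl. The helper build_headers, the application name, and the contact address are placeholders I made up; use something that identifies your own project.

```python
def build_headers(token, app_name, contact):
    """Return HTTP headers as 'Name: value' strings, the form pycurl.HTTPHEADER takes."""
    return [
        f"Authorization: token {token}",
        # An identifying User-Agent: project name plus a way to reach you.
        f"User-Agent: {app_name} ({contact})",
        "Accept: application/vnd.github.v3+json",
    ]


# Placeholder values for illustration only.
headers = build_headers("<your-token>", "keras-import-survey", "you@example.com")
print(headers[1])

# In the question's code this would replace the existing HTTPHEADER line:
#   request.setopt(pycurl.HTTPHEADER, headers)
```

PyCurl also has a dedicated pycurl.USERAGENT option, but keeping everything in one header list makes it easier to see exactly what is sent.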
