Hi, I have a Go backend with gorilla/mux that uses a third-party API. Some of my handlers make requests to this API, and my limit is 5 requests per second.
How could I implement an overall rate-limiting system where requests are queued and sent through only when capacity is available (i.e. when one of the five slots is free)? Thanks.
CodePudding user response:
For rate-limiting requests to the third-party API you can use the Go library golang.org/x/time/rate.
Sample usage:
package main

import (
    "context"
    "log"
    "net/http"
    "time"

    "golang.org/x/time/rate"
)

func main() {
    // One request every 10 seconds, with bursts of up to 50. For the 5 req/s
    // limit in the question, use rate.NewLimiter(rate.Every(time.Second/5), 5).
    rl := rate.NewLimiter(rate.Every(10*time.Second), 50)

    reqURL := "https://www.google.com"
    c := http.Client{}
    req, err := http.NewRequest("GET", reqURL, nil)
    if err != nil {
        log.Fatalf("failed to create request: %v", err)
    }
    for i := 0; i < 300; i++ {
        // Block until the rate limiter allows the next request.
        if err := rl.Wait(context.Background()); err != nil {
            log.Printf("failed to wait: %v", err)
            continue
        }
        // Do the request now that we are within the rate limit.
        resp, err := c.Do(req)
        if err != nil {
            log.Printf("request failed: %v", err)
            continue
        }
        resp.Body.Close()
    }
}
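To connect this to the gorilla/mux handlers in the question, one option is to share a single limiter among all handlers by wrapping it in an http.RoundTripper, so every outgoing call through one http.Client is throttled. A minimal sketch under that assumption (RateLimitedTransport, NewAPIClient, and the URL are illustrative names, not library APIs):

package main

import (
    "log"
    "net/http"
    "time"

    "golang.org/x/time/rate"
)

// RateLimitedTransport blocks each outgoing request until the shared
// limiter grants a slot, then delegates to the base transport.
type RateLimitedTransport struct {
    limiter *rate.Limiter
    base    http.RoundTripper
}

func (t *RateLimitedTransport) RoundTrip(req *http.Request) (*http.Response, error) {
    // Wait returns early if the request's context is canceled.
    if err := t.limiter.Wait(req.Context()); err != nil {
        return nil, err
    }
    return t.base.RoundTrip(req)
}

// NewAPIClient returns a client all handlers can share; the limiter
// enforces the 5 req/s budget across every handler.
func NewAPIClient() *http.Client {
    return &http.Client{
        Transport: &RateLimitedTransport{
            limiter: rate.NewLimiter(rate.Every(time.Second/5), 5),
            base:    http.DefaultTransport,
        },
    }
}

func main() {
    c := NewAPIClient()
    resp, err := c.Get("https://api.example.com/resource") // placeholder URL
    if err != nil {
        log.Fatalf("request failed: %v", err)
    }
    resp.Body.Close()
}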
IMPORTANT!!! This is not a solution for running multiple instances! If you're spawning multiple servers, you should consider using Redis to synchronize the limiters (https://github.com/go-redis/redis_rate).
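For reference, a minimal sketch of that Redis-backed variant, assuming the redis_rate v10 API described in the project README and a local Redis instance (the key name "third-party-api" and the address are placeholders):

package main

import (
    "context"
    "fmt"
    "time"

    "github.com/go-redis/redis_rate/v10"
    "github.com/redis/go-redis/v9"
)

func main() {
    ctx := context.Background()
    rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
    limiter := redis_rate.NewLimiter(rdb)

    // Every instance checking this key shares the same 5 req/s budget.
    res, err := limiter.Allow(ctx, "third-party-api", redis_rate.PerSecond(5))
    if err != nil {
        panic(err)
    }
    if res.Allowed == 0 {
        // Budget exhausted: wait the suggested time before retrying.
        time.Sleep(res.RetryAfter)
    }
    fmt.Println("allowed:", res.Allowed, "remaining:", res.Remaining)
}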
CodePudding user response:
If I understand correctly, you have an application X that connects to an API Y, and X can't send more than 5 requests per second to Y.
This is a complex and underspecified scenario. Let me ask a few questions:
- What is the expected load on X? If it stays below 5 requests per second, you are fine.
- What is the timeout on X? Imagine you receive 50 requests per second; in that scenario you may need 10 seconds to answer some of them. Is that acceptable?
- In the case of a timeout in X, will the client just retry?
- What happens if you call Y more than 5 times per second?
- Is the response from Y cacheable?
- Do you have multiple servers / autoscaling?
One possibility is to set a rate limiter in the application to match the limit on the API.
Another is to just call the API as much as you can. If a call fails because of too many requests, you can implement retry logic or give up, as sketched below.
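A minimal sketch of that retry approach, assuming the API signals overload with HTTP 429 (doWithRetry, the endpoint URL, and the backoff values are illustrative, not from any library):

package main

import (
    "fmt"
    "log"
    "net/http"
    "time"
)

// doWithRetry retries on HTTP 429 with exponential backoff and gives up
// after maxRetries attempts.
func doWithRetry(c *http.Client, req *http.Request, maxRetries int) (*http.Response, error) {
    backoff := 200 * time.Millisecond
    for attempt := 0; ; attempt++ {
        resp, err := c.Do(req)
        if err != nil {
            return nil, err
        }
        if resp.StatusCode != http.StatusTooManyRequests {
            return resp, nil
        }
        resp.Body.Close()
        if attempt >= maxRetries {
            return nil, fmt.Errorf("gave up after %d retries", maxRetries)
        }
        time.Sleep(backoff)
        backoff *= 2 // exponential backoff between attempts
    }
}

func main() {
    req, err := http.NewRequest("GET", "https://api.example.com/resource", nil) // placeholder URL
    if err != nil {
        log.Fatalf("failed to create request: %v", err)
    }
    resp, err := doWithRetry(http.DefaultClient, req, 5)
    if err != nil {
        log.Fatalf("request failed: %v", err)
    }
    defer resp.Body.Close()
    log.Println("status:", resp.Status)
}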
If you need to be very careful with this API for some reason, and you don't need to run multiple instances / autoscale, the solution is to use a rate limiter in the application.
If you need to run several instances, you need something that centralizes access to this API, and that is a very delicate thing: it is a single point of failure. You can implement a token system that only delivers 5 tokens per second; once you have a token, you can access the API. That is one possibility (see the sketch below).
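A minimal in-process sketch of such a token dispenser, using a ticker-fed channel (in a real multi-instance setup this would sit behind its own service or Redis; the function name and rate are illustrative):

package main

import (
    "fmt"
    "time"
)

// tokens hands out at most perSecond tokens per second; a caller must
// receive a token before each API call.
func tokens(perSecond int) <-chan struct{} {
    ch := make(chan struct{})
    go func() {
        ticker := time.NewTicker(time.Second / time.Duration(perSecond))
        defer ticker.Stop()
        for range ticker.C {
            ch <- struct{}{} // blocks until some caller takes the token
        }
    }()
    return ch
}

func main() {
    tok := tokens(5)
    for i := 0; i < 10; i++ {
        <-tok // wait for a free slot before calling the API
        fmt.Println("calling API at", time.Now().Format("15:04:05.000"))
    }
}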
There is no free lunch; each solution has pros and cons. But if you can avoid making requests to the API at all, for example by caching the results (see the sketch below), or by adding messages to a queue if you only need to store the data (and processing them with an async worker), perhaps it will be easier to find a better solution.
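For completeness, a tiny TTL cache sketch around the API call (ttlCache and getOrFetch are hypothetical helpers; in production a ready-made cache library would do):

package main

import (
    "fmt"
    "sync"
    "time"
)

type cacheEntry struct {
    value   string
    expires time.Time
}

type ttlCache struct {
    mu      sync.Mutex
    entries map[string]cacheEntry
    ttl     time.Duration
}

func newTTLCache(ttl time.Duration) *ttlCache {
    return &ttlCache{entries: make(map[string]cacheEntry), ttl: ttl}
}

// getOrFetch returns a cached value if it is still fresh; otherwise it calls
// fetch (e.g. the rate-limited API) and stores the result. Holding the lock
// during fetch also deduplicates concurrent fetches for the same key.
func (c *ttlCache) getOrFetch(key string, fetch func() (string, error)) (string, error) {
    c.mu.Lock()
    defer c.mu.Unlock()
    if e, ok := c.entries[key]; ok && time.Now().Before(e.expires) {
        return e.value, nil
    }
    v, err := fetch()
    if err != nil {
        return "", err
    }
    c.entries[key] = cacheEntry{value: v, expires: time.Now().Add(c.ttl)}
    return v, nil
}

func main() {
    cache := newTTLCache(30 * time.Second)
    v, _ := cache.getOrFetch("resource", func() (string, error) {
        return "api response", nil // stand-in for the real API call
    })
    fmt.Println(v)
}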