Get large data from API with pagination

Time:09-08

I'm trying to GET a large amount of data from an API (over 300k records). It is paginated (25 records per page) and the rate limit is 50 requests per 3 minutes. I'm using PHP curl to get the data. The API requires JWT token authorization. I can get a single page and put its records into an array.

...
$response = curl_exec($curl);
curl_close($curl);
$result = json_decode($response, true);

The problem is that I need to get all records from all pages and save them into an array or a file. How can I do that? Would it be better to use JS?

Best regards and thank you.

CodePudding user response:

Ideally use cron and some form of storage, database or a file.
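For a sense of scale, here is a rough estimate of how long the full download takes at the limits given in the question. It assumes every window is used for a full batch of 50 requests and ignores the time the requests themselves take; the function name is only illustrative.

```php
<?php
// Figures come from the question; the "one full batch per window" model is an assumption.
function estimateMinutes(int $records, int $perPage, int $requestsPerWindow, int $windowMinutes): int
{
    $pages   = intdiv($records + $perPage - 1, $perPage);                    // round up to whole pages
    $windows = intdiv($pages + $requestsPerWindow - 1, $requestsPerWindow);  // round up to whole windows
    return $windows * $windowMinutes;
}

echo estimateMinutes(300000, 25, 50, 3), " minutes\n"; // prints "720 minutes"
```

That is 12,000 pages and about 12 hours, which is why a single long-running web request is not a good fit here.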

It is important to ensure that a new run of the script doesn't start until the previous one has finished. Otherwise the runs stack up, and after a few of them you get server overload and failed scripts, and it gets messy.

  1. Store a value to say the script is starting.

  2. Run the CURL request.

  3. Once curl has returned and the data has been processed and stored, change the value you stored at the beginning to say the script has finished.

  4. Run this script from cron at the intervals you deem necessary.

A simplified example:

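    <?php

    // A plain variable does not survive between runs, so the "busy" flag
    // here is a lock file (the file name is an assumption).
    $lock = fopen(__DIR__ . '/fetch.lock', 'c');

    // Exit if the lock cannot be taken, i.e. a previous run is still busy.
    if ($lock === false || !flock($lock, LOCK_EX | LOCK_NB)) exit();

    // YOUR CURL REQUEST AND PROCESSING HERE

    // Mark the script as finished.
    flock($lock, LOCK_UN);
    fclose($lock);

    ?>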
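The curl request and processing step could look roughly like the sketch below: each run fetches at most 50 pages, appends the records to a file, and stores the last page number so the next run resumes from there. The endpoint URL, the `page` query parameter, the file names, the response being a plain JSON array of records, and an empty page meaning "no more data" are all assumptions, since the question doesn't show the API.

```php
<?php
// Sketch only: endpoint, "page" parameter, and file names are assumptions,
// not details taken from the API in the question.

const PAGES_PER_RUN = 50; // the stated limit: 50 requests per 3 minutes

/** Fetch one page and return its decoded records; throws on failure. */
function fetchPage(string $baseUrl, int $page, string $jwt): array
{
    $curl = curl_init($baseUrl . '?page=' . $page);
    curl_setopt_array($curl, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER     => ['Authorization: Bearer ' . $jwt],
    ]);
    $response = curl_exec($curl);
    $status   = curl_getinfo($curl, CURLINFO_RESPONSE_CODE);
    curl_close($curl);

    if ($response === false || $status !== 200) {
        throw new RuntimeException("Page $page failed (HTTP $status)");
    }
    // Assumes the body is a JSON array of records.
    return json_decode($response, true) ?? [];
}

/**
 * Fetch up to $maxPages pages after the stored checkpoint, append each
 * record as one JSON line to $outFile, and return the last page fetched.
 * $fetch is any callable that takes a page number and returns an array of records.
 */
function runBatch(callable $fetch, string $checkpointFile, string $outFile, int $maxPages): int
{
    $lastPage = is_file($checkpointFile) ? (int) file_get_contents($checkpointFile) : 0;

    for ($i = 0; $i < $maxPages; $i++) {
        $records = $fetch($lastPage + 1);
        if ($records === []) {
            break; // assumed end-of-data signal: an empty page
        }
        foreach ($records as $record) {
            file_put_contents($outFile, json_encode($record) . "\n", FILE_APPEND);
        }
        $lastPage++;
        // Save progress after every page so a failed run loses nothing.
        file_put_contents($checkpointFile, (string) $lastPage);
    }
    return $lastPage;
}
```

A cron entry running every 3 minutes would then call something like `runBatch(fn (int $p) => fetchPage($url, $p, $token), 'page.txt', 'records.jsonl', PAGES_PER_RUN);`, where `$url` and `$token` are your endpoint and JWT. Writing one JSON record per line keeps memory use flat, which matters with 300k records.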

CodePudding user response:

JS is not better.

This is not that difficult.

Why are you not sending all the data in the response to the request?

To keep the 25 record pagination just break the response into pages of 25 records each.
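If you do control the server side, splitting a full result set into pages of 25 is a single call to `array_chunk`. A minimal sketch, with made-up data standing in for the real records:

```php
<?php
// $records stands in for the full result set; the data here is made up.
$records = range(1, 60);

// Split into pages of 25 records each; the last page holds the remainder.
$pages = array_chunk($records, 25);

echo count($pages), "\n"; // prints 3 (pages of 25, 25, and 10 records)
```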

I fail to see where there is a problem.

You are not very clear about the requests. You say they can only make 50 requests every 3 minutes. You do indicate how many records can be returned per request.
Or did you mean 50 records every 3 minutes?

You may want to rethink your policy. The easiest thing for you is to return as many records as requested in the first request.

So what do you need help with?

  • How to return the records and maintain 25 record pagination?
  • To enforce your limitations?
  • How to monetize your service?

I do not understand your problem. It sounds like you created the problem by imposing limitations, and these limitations only make things more difficult for you. Is there money involved? Why are you trying to make things more difficult for yourself and the user? Do you have a service that will be in very high demand?
