How do I iterate through and append the data from multiple pages with an API request?

Time:07-10

I'm using an Indeed API from Rapid API to collect job data. The code snippet provided only returns results for 1 page. I was wondering how to set up a for loop to iterate through multiple pages and append the results together.

import requests

url = "https://indeed11.p.rapidapi.com/"


payload = {
    "search_terms": "data visualization",
    "location": "New York City, NY",
    "page": 1,
    "fetch_full_text": "yes"
}

headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "{api key here}", # insert here,
    "X-RapidAPI-Host": "indeed11.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)

As seen in the code above, the key "page" is set to a value of 1. How would I parameterize this value, and how would I construct the for loop while appending the results from each page? Thank you so much.

CodePudding user response:

You can paginate by updating the "page" value in the payload inside a for loop over range(), collecting each response as you go:

import requests

url = "https://indeed11.p.rapidapi.com/"

payload = {
    "search_terms": "data visualization",
    "location": "New York City, NY",
    "page": 1,
    "fetch_full_text": "yes"
}

headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "{api key here}", # insert here,
    "X-RapidAPI-Host": "indeed11.p.rapidapi.com"
}
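responses = []
for page in range(1, 11):  # pages 1 through 10
    payload['page'] = page
    response = requests.post(url, json=payload, headers=headers)
    responses.append(response)  # keep every page's response instead of overwriting it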

CodePudding user response:

You can try this:

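max_page = 100
result = {}  # maps page number -> response
for i in range(1, max_page + 1):
    try:
        payload.update({'page': i})
        result[i] = requests.request("POST", url, json=payload, headers=headers)
    except requests.RequestException:
        # skip pages whose request fails (e.g. connection error or timeout)
        continue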
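
If silently skipping failed pages is a concern, one option is to record which pages failed so they can be retried later. The following is only a minimal sketch and is not part of the answer above: fetch_pages and fetch_page are hypothetical names, and fetch_page(page) stands in for whatever performs the request for one page (for example, a small wrapper around requests.post that raises on failure).

```python
def fetch_pages(fetch_page, pages):
    """Call fetch_page(page) for each page; return (results, failed).

    fetch_page is a hypothetical callable supplied by the caller. It should
    return the data for one page and raise an exception on failure.
    """
    results = {}  # page number -> data returned for that page
    failed = []   # page numbers whose request raised an exception
    for page in pages:
        try:
            results[page] = fetch_page(page)
        except Exception:  # narrow this to requests.RequestException in real code
            failed.append(page)
    return results, failed


# Stand-in fetcher for demonstration only: page 3 always fails.
def fake_fetch(page):
    if page == 3:
        raise RuntimeError("simulated network error")
    return [f"job-{page}-a", f"job-{page}-b"]


results, failed = fetch_pages(fake_fetch, range(1, 5))
print(sorted(results))  # pages that succeeded
print(failed)           # pages to retry
```

Keeping the failed page numbers separate makes it easy to run the same function again over just those pages.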

CodePudding user response:

I think you could do this with a while loop. You would need code that detects when there are no more pages to read, and how to do that depends on what the API returns for a page past the end. Here's what I would do:

import requests

url = "https://indeed11.p.rapidapi.com/"

payload = {
    "search_terms": "data visualization",
    "location": "New York City, NY",
    "page": 1,
    "fetch_full_text": "yes"
}

headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "{api key here}", # insert here,
    "X-RapidAPI-Host": "indeed11.p.rapidapi.com"
}

responses = []
while not no_more_pages():  # no_more_pages() is a placeholder for code that detects when there are no more pages to read
    responses.append(requests.request("POST", url, json=payload, headers=headers))
    payload['page'] += 1

Once the loop is done, you could use the responses list to access the data.
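
One way to fill in the no_more_pages() placeholder is to stop as soon as a page comes back empty, with a hard cap on the number of pages as a safety net. The sketch below assumes each page parses to a list of records and that a page past the end yields an empty list. The thread does not show what this API actually returns, so check a real response first. collect_all, fetch_page, and max_pages are hypothetical names.

```python
def collect_all(fetch_page, max_pages=100):
    """Fetch pages 1, 2, 3, ... until one is empty or max_pages is reached.

    fetch_page(page) is a hypothetical callable returning a list of records
    for that page (assumed to be empty once the results run out).
    """
    records = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:          # empty page: assume there is nothing further
            break
        records.extend(batch)  # append this page's records to the combined list
    return records


# Stand-in data source for demonstration only: three pages of results.
FAKE_PAGES = {1: ["a", "b"], 2: ["c", "d"], 3: ["e"]}


def fake_fetch(page):
    return FAKE_PAGES.get(page, [])


all_records = collect_all(fake_fetch)
print(all_records)
```

With the payload and headers from the code above, fetch_page could set payload['page'] = page, call requests.post(url, json=payload, headers=headers), and return response.json(), provided the parsed JSON really is a list.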
