API pagination loop

Time:08-25

I have successfully created a loop to paginate an API I am working with. My challenge is concatenating the DataFrames once the loop is done, so that I end up with one single DataFrame. Any help will go a long way.

import requests
import json
import pandas as pd
from pandas import json_normalize

url = 'https://vendors.paddle.com/api/2.0/subscription/users'
headers = {
    'Content-Type': 'application/x-www-form-urlencoded',
    'User-Agent': 'PostmanRuntime/7.29.2',
    'Accept': '*/*',
    'Accept-Encoding': 'gzip',
    'Connection': 'keep-alive'}

page = 1
data_nested = []

while data_nested is not None:
    data = [('vendor_id', xxxxx),('vendor_auth_code','4803155ec5f5a17d589b650cxxxxxxxxx'),('results_per_page',200),('page',page)]
    response = requests.post(url, headers=headers , data = data)
    data_nested = response.json()['response']
    data_flattened = pd.json_normalize(data_nested)
    df = pd.DataFrame.from_dict(data_flattened)
    print(df)
    if len(df.index)==0:
        break
    page += 1  # advance to the next page; resetting to 1 would loop forever

CodePudding user response:

If I'm reading this correctly, you are currently just printing each page's DataFrame? If I understand what you want, you could do something like this: append each DataFrame to a list inside the loop, then print everything at the end.

page = 1
data_nested = []
loop = []

while data_nested is not None:
    data = [('vendor_id', xxxxx),('vendor_auth_code','4803155ec5f5a17d589b650cxxxxxxxxx'),('results_per_page',200),('page',page)]
    response = requests.post(url, headers=headers , data = data)
    data_nested = response.json()['response']
    data_flattened = pd.json_normalize(data_nested)
    df = pd.DataFrame.from_dict(data_flattened)
    loop.append(df)
    if len(df.index)==0:
        break
    page += 1  # advance to the next page
print(loop)

CodePudding user response:

And to end up with a dataframe, I adjusted Aaron Cloud's solution to:

page = 1
data_nested = []
loop = []

while data_nested is not None:
    data = [('vendor_id', xxxxx),('vendor_auth_code','4803155ec5f5a17d589b650cxxxxxxxxx'),('results_per_page',200),('page',page)]
    response = requests.post(url, headers=headers , data = data)
    data_nested = response.json()['response']
    data_flattened = pd.json_normalize(data_nested)
    df = pd.DataFrame.from_dict(data_flattened)
    loop.append(df)
    if len(df.index)==0:
        break
    page += 1  # advance to the next page
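# pd.concat returns a new DataFrame, so keep the result; ignore_index=True renumbers the rows
df = pd.concat(loop, ignore_index=True)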
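
For anyone who wants to try the pattern without hitting the real API, here is a minimal sketch of the same idea: collect one DataFrame per page in a list, stop on the first empty page, and concatenate once at the end. The helper `fetch_all_pages`, the `fetch_page` callable, and the sample records in `fake_pages` are all made up for illustration; they are not part of the Paddle API or of the code above.

```python
import pandas as pd


def fetch_all_pages(fetch_page, start_page=1):
    """Collect every page returned by fetch_page into one DataFrame.

    fetch_page is a hypothetical callable: it takes a page number and
    returns a list of records (dicts), or an empty list when no pages remain.
    """
    frames = []
    page = start_page
    while True:
        records = fetch_page(page)
        if not records:  # empty list or None: nothing left to fetch
            break
        frames.append(pd.json_normalize(records))  # flatten nested fields
        page += 1
    if not frames:
        return pd.DataFrame()  # pd.concat raises on an empty list
    # ignore_index=True renumbers the rows 0..n-1 across all pages
    return pd.concat(frames, ignore_index=True)


# Stand-in for the real API call, for illustration only.
fake_pages = {
    1: [{"subscription_id": 1, "user": {"email": "a@example.com"}}],
    2: [{"subscription_id": 2, "user": {"email": "b@example.com"}}],
}

df_all = fetch_all_pages(lambda page: fake_pages.get(page, []))
print(df_all)
```

With the loop from the question, `fetch_page` would presumably wrap the `requests.post` call and return `response.json()['response']`. Concatenating once after the loop is generally cheaper than growing a DataFrame on every iteration.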