I'm trying to get the JSON data from every page of an API and combine it into one big JSON output.
(Docs for the API I'm using: https://docs.scoresaber.com/#/Leaderboards/get_api_leaderboards)
When doing the following API call:
https://scoresaber.com/api/leaderboards?qualified=true&withMetadata=true
I get a metadata object, which contains total and itemsPerPage.
Example:
"metadata": {
"total": 193,
"page": 1,
"itemsPerPage": 14
}
So 193/14, rounded up, means I get 14 pages.
This means I can iterate through all the pages by making a request for each page with this API call: https://scoresaber.com/api/leaderboards?qualified=true&page=2
and so on, until I get to &page=14.
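Quick sanity check on that page count (a minimal sketch in Python, using the metadata values above):
import math

# 193 qualified leaderboards at 14 per page -> 14 pages, so the last request is &page=14
print(math.ceil(193 / 14))  # 14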
Each page returns JSON like this (trimmed example):
{
"leaderboards": [
{
"id": 466447,
"songHash": "E527C82AF2DEC46A23F12D742035D76CCA875904",
"songName": "Parasite",
"songSubName": "(feat. Hatsune Miku)",
"songAuthorName": "DECO*27",
"levelAuthorName": "Alice",
"difficulty": {
"leaderboardId": 466447,
"difficulty": 1,
"gameMode": "SoloStandard",
"difficultyRaw": "_Easy_SoloStandard"
},
"maxScore": 0,
"createdDate": "2022-06-01T17:16:52.000Z",
"rankedDate": null,
"qualifiedDate": "2022-06-14T05:53:21.000Z",
"lovedDate": null,
"ranked": false,
"qualified": true,
"loved": false,
"maxPP": -1,
"stars": 0,
"plays": 70,
"dailyPlays": 0,
"positiveModifiers": false,
"playerScore": null,
"coverImage": "https://cdn.scoresaber.com/covers/E527C82AF2DEC46A23F12D742035D76CCA875904.png",
"difficulties": null
},
],
"metadata": {
"total": 193,
"page": 2,
"itemsPerPage": 14
}
}
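For a single page, the part I care about is the leaderboards list (a minimal sketch with requests; the page number is just an example):
import requests

resp = requests.get("https://scoresaber.com/api/leaderboards?qualified=true&page=2")
page_json = resp.json()
print(page_json["metadata"]["page"])   # 2
print(len(page_json["leaderboards"]))  # at most itemsPerPage (14) entries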
So what I want is to loop through all the pages and collect every item in leaderboards into one JSON output.
This is what I've tried:
import requests
import math
import json
response = requests.get("https://scoresaber.com/api/leaderboards?qualified=true&withMetadata=true")
api = json.loads(response.text)
pages = math.ceil(api['metadata']['total'] / api['metadata']['itemsPerPage'])
api = {}
for page in range(1, pages + 1):
    api.update(json.loads(requests.get(f"https://scoresaber.com/api/leaderboards?qualified=true&page={page}").text))
api = json.dumps(api, indent=4)
But that seems to only get the last page and just overwrite the dictionary (I'm also not sure whether I need to declare api as a dict).
So I'm just not sure what is going wrong: whether I'm declaring things wrongly, requesting the API wrongly, or putting things into the dict wrongly, etc. Any help would be much appreciated.
CodePudding user response:
If I understand you correctly, you want to collect all the data into one big list:
import json
import math
import requests
url1 = (
"https://scoresaber.com/api/leaderboards?qualified=true&withMetadata=true"
)
url2 = "https://scoresaber.com/api/leaderboards?qualified=true&page={}"
api = requests.get(url1).json()
pages = math.ceil(api["metadata"]["total"] / api["metadata"]["itemsPerPage"])
all_data = []
for page in range(1, pages + 1):
    # every page has the same structure, so just collect its "leaderboards" entries
    data = requests.get(url2.format(page)).json()
    all_data.extend(data["leaderboards"])
print(json.dumps(all_data, indent=4))
This will print all 193 items from all pages:
[
{
"id": 484864,
"songHash": "80559A7A4AC0F62F27DAF1C59DF67F305250ADFF",
"songName": "Phony",
"songSubName": "feat. KAFU (Hoshimachi Suisei Cover)",
"songAuthorName": "Tsumiki",
"levelAuthorName": "Joshabi & Shad",
...
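As for why the original attempt only kept the last page: every page's response uses the same two top-level keys, leaderboards and metadata, so each api.update(...) call replaces what the previous iteration stored instead of adding to it. A tiny illustration:
combined = {}
combined.update({"leaderboards": ["page 1 item"], "metadata": {"page": 1}})
combined.update({"leaderboards": ["page 2 item"], "metadata": {"page": 2}})
print(combined["leaderboards"])  # ['page 2 item'] -- page 1's entries are gone
Collecting the entries into a plain list with extend, as in the loop above, avoids that. And if you want the combined result in a file instead of printed, json.dump(all_data, f, indent=4) on an open file handle writes the same output.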