How to specify what is included in the JSON response?

I am getting information from an API using Python requests with the following code:

import json
import requests

resp = requests.get('https://api.oxoservices.eu/api/v1/startups?site=labs&startup_status=funded')

json_resp = json.loads(resp.text)

# Print every company record in full
for company in json_resp['data']:
    print(json.dumps(company, indent=4))
    print()

# Save the complete response once, after the loop
# (writing inside the loop rewrote the same file on every iteration)
with open("test.json", "w", encoding='utf-8') as file:
    json.dump(json_resp, file, indent=4, sort_keys=True)

It extracts all the information I need, but also a lot that I don't need, which is my problem.

I get the output:

"data": [
    {
        "cover": null,
        "cover_id": null,
        "created_at": "2021-01-05T05:56:03.000000Z",
        "focus": {
            "color": "#25c9b6",
            "created_at": "2016-06-15T10:46:50.000000Z",
            "id": 15,
            "is_active": true,
            "name": "Financial Technologies",
            "updated_at": "2016-06-15T10:46:50.000000Z"
        },
        "focus_id": 15,
        "id": 1111,
        "irr": 0,
        "is_active": false,
        "name": "iconicchain",
        "photo": {
            "created_at": "2021-11-15T17:16:17.000000Z",
            "filename": "iconicchain.png",
            "id": "52b7c33f-c74c-4099-88cb-944b4047cf85",
            "mime": "image/png",
            "size": 14056,
            "type": "photo",
            "url": "/attachments/52b7c33f-c74c-4099-88cb-944b4047cf85"
        },
        "photo_id": "52b7c33f-c74c-4099-88cb-944b4047cf85",
        "raised_type": {
            "id": 3,
            "key": "seed",
            "name": "Seed"
        },
        "startup_investment_type": {
            "id": 1,
            "key": "none",
            "name": "Not seeking"
        },
        "startup_stage_id": 4,
        "startup_status": {
            "id": 5,
            "key": "funded",
            "name": "Funded"
        },
        "startup_valuation_basis": {
            "id": 3,
            "key": "next_funding_round",
            "name": "Next funding round"
        },
        "summary": "Compliance based on facts, not faith-delivering regulatory compliance automation solutions for the financial sector.",
        "video_id": null,
        "video_type_id": "1",
        "website": "https://www.iconicchain.com"

From this data I would like to extract only the website, which in this case would be https://www.iconicchain.com, and the top-level name, iconicchain.

Answer:

First of all, if you're using requests to pull data from a JSON API, you don't need to import the json package as well: requests can parse the JSON response by itself through r.json(). You can do what you need with just requests and pandas:

import requests
import pandas as pd

r = requests.get('https://api.oxoservices.eu/api/v1/startups?site=labs&startup_status=funded')
df = pd.DataFrame(r.json()['data'])
df = df[['name', 'website']]
print(df)

This will return:

            name                      website
0    iconicchain  https://www.iconicchain.com
1  Gloster Nyrt.          https://gloster.hu/
2        Vilhemp           https://vilhemp.hu
3       HackRate           https://hckrt.com/
4     Commsignia    http://www.commsignia.com
5       BitNinja          https://bitninja.io
[...]
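
If you'd rather not pull in pandas just to pick two keys, the same selection can be done in plain Python. The sketch below is only an illustration: pick_fields and sample_data are hypothetical names, and the sample is a hand-trimmed stand-in for the API's "data" list (values copied from the output above) so that it runs without a network call.

```python
import json


def pick_fields(records, fields=("name", "website")):
    """Return one small dict per record, keeping only the requested keys."""
    # dict.get() yields None when a record lacks a key, instead of raising KeyError
    return [{field: record.get(field) for field in fields} for record in records]


# Hypothetical stand-in for json_resp['data']; real records carry many more keys
sample_data = [
    {"id": 1111, "name": "iconicchain", "website": "https://www.iconicchain.com", "irr": 0},
    {"name": "Gloster Nyrt.", "website": "https://gloster.hu/", "video_id": None},
]

slim = pick_fields(sample_data)

for company in slim:
    print(company["name"], company["website"])

# The reduced list serializes like any other JSON-compatible structure
print(json.dumps(slim, indent=4))
```

With the live response you would presumably call pick_fields(r.json()['data']) and, to save the result, pass the reduced list to json.dump(slim, file, indent=4) in place of the full response.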