How do I extract all keys from this JSON file?

Time:06-03

The data I'm working with is the entire list of Yu-Gi-Oh! cards, found at this endpoint: https://db.ygoprodeck.com/api/v7/cardinfo.php

There's a top-level data node that I can ignore, but each sub-node immediately after that (0, 1, 2,...) is a unique card in the cardset. I want to find out how many unique keys there are in this entire dataset. The tricky part is that each card could have different sub-nodes than another card. Eventually I want to put this all into SQL tables, but for now I need to know all of the keys. Some examples of the keys are id, name, archetype, atk, def, and card_sets. How do I extract a unique list of all keys? I'm looking for the easiest way to get this list. I have experience in Python and T-SQL, but any other language is fine since my goal is to just look at the list.

CodePudding user response:

I used a generator to solve this problem:
If the data is a dict, its keys are yielded and its values are parsed recursively.
If the data is a list or tuple, its elements are parsed in turn.
Strings are also iterable, so they need to be excluded.

import json
import requests
from collections.abc import Iterable  # the collections.Iterable alias was removed in Python 3.10


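def get_key(data):
    """Recursively yield every dict key found in nested JSON data."""
    if isinstance(data, dict):
        for k, v in data.items():
            yield k
            # Values may hold further dicts or lists, so descend into them.
            yield from get_key(v)
    elif isinstance(data, Iterable) and not isinstance(data, str):
        # Lists and tuples: parse each element. Strings are skipped because
        # iterating a string yields strings again and would recurse endlessly.
        for i in data:
            yield from get_key(i)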


def main():
    url = "https://db.ygoprodeck.com/api/v7/cardinfo.php"
    res = requests.get(url)
    res.raise_for_status()  # stop early if the request failed
    data = json.loads(res.text)["data"]
    result = set(get_key(data))
    print(result)


if __name__ == '__main__':
    main()

And the output (a set, so the order can differ between runs):

{'set_rarity', 'def', 'image_url_small', 'ban_tcg', 'set_rarity_code', 'linkmarkers', 'type', 'coolstuffinc_price', 'cardmarket_price', 'id', 'image_url', 'level', 'name', 'set_code', 'banlist_info', 'desc', 'set_name', 'card_prices', 'attribute', 'linkval', 'ebay_price', 'tcgplayer_price', 'card_sets', 'atk', 'set_price', 'ban_ocg', 'ban_goat', 'archetype', 'amazon_price', 'scale', 'race', 'card_images'}
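
Since the end goal is SQL tables, it may also help to see where each key sits in the hierarchy and how often it appears. The following is only a sketch of one way to do that, not part of the solution above: the function name iter_key_paths and the two-card sample are made up for illustration, and the sample only imitates the shape of the API's "data" list so the snippet runs without a network call.

```python
from collections import Counter


def iter_key_paths(data, prefix=""):
    """Yield a dotted path for every dict key found in nested JSON data."""
    if isinstance(data, dict):
        for key, value in data.items():
            path = f"{prefix}.{key}" if prefix else key
            yield path
            yield from iter_key_paths(value, path)
    elif isinstance(data, (list, tuple)):
        # List positions are left out of the path, so the entries of every
        # card's 'card_sets' list collapse into the same 'card_sets.*' paths.
        for item in data:
            yield from iter_key_paths(item, prefix)


# Hypothetical sample, shaped like the API's "data" list (not real card data).
sample = [
    {"id": 1, "name": "A", "atk": 1000,
     "card_sets": [{"set_name": "X", "set_code": "X-001"}]},
    {"id": 2, "name": "B", "archetype": "Z",
     "card_sets": [{"set_name": "Y", "set_code": "Y-001"},
                   {"set_name": "W", "set_code": "W-001"}]},
]

# Count how many times each key path occurs across the whole sample.
path_counts = Counter(iter_key_paths(sample))
for path, count in sorted(path_counts.items()):
    print(path, count)
```

Paths that contain a dot, such as card_sets.set_code, come from nested lists of dicts and would probably map to child tables, while keys that appear on only some cards (archetype or atk in the sample) hint at nullable columns. To run it on the real data, pass the "data" list from the response in place of sample.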