Modify existing JSON to create a new custom one in Python

Time:08-06

Hi, I'm trying to trim unused data from a JSON response to create a new one with only two fields: title and description. The title works fine, but I can't figure out how to get the description field. The JSON is public; you can fetch it from the URL used in the code below, or see the sample at the end of the post.

My code that extracts the title field:

import requests
import json

def trim_json(d):
    newd = {}
    for name in ['title']:
        newd[name] = d[name]
    return newd

def clean():
    books = requests.get('https://openlibrary.org/authors/OL23919A/works.json')
    books_parsed = books.json()
    book_data = books_parsed['entries']
  
    book_data = [trim_json(d) for d in book_data]
    print(book_data)
    return book_data

Update:

The clean function returns a list of dicts in this format:

[{'title': 'Harry Potter House Gryffindor Edition Series 1-5 Books Collection Set By J.K. Rowling'}]

What I want to get is:

[{'title': 'Harry Potter House Gryffindor Edition Series 1-5 Books Collection Set By J.K. Rowling', 'description': 'lorem ipsum'}]

And if there is no description:

[{'title': 'Harry Potter House Gryffindor Edition Series 1-5 Books Collection Set By J.K. Rowling', 'description': 'undefined'}]

How can I get JSON that contains both title and description? Here is a sample of two entries from the response:

    {
        "type": {
            "key": "/type/work"
        },
        "title": "Journey to Hogwarts",
        "authors": [
            {
                "type": {
                    "key": "/type/author_role"
                },
                "author": {
                    "key": "/authors/OL23919A"
                }
            }
        ],
        "covers": [
            2520429
        ],
        "key": "/works/OL28602152W",
        "latest_revision": 1,
        "revision": 1,
        "created": {
            "type": "/type/datetime",
            "value": "2022-08-05T00:16:59.602176"
        },
        "last_modified": {
            "type": "/type/datetime",
            "value": "2022-08-05T00:16:59.602176"
        }
    },
    {
        "description": "Harry Potter #2\r\n\r\nThroughout the summer holidays after his first year at Hogwarts School of Witchcraft and Wizardry, Harry Potter has been receiving sinister warnings from a house-elf called Dobby.\r\n\r\nNow, back at school to start his second year, Harry hears unintelligible whispers echoing through the corridors.\r\n\r\nBefore long the attacks begin: students are found as if turned to stone.\r\n\r\nDobby’s predictions seem to be coming true.\r\n\r\n[Source][1]\r\n\r\n\r\n  [1]: https://www.jkrowling.com/book/harry-potter-chamber-secrets/",
        "links": [
            {
                "title": "Author's book page",
                "url": "https://www.jkrowling.com/book/harry-potter-chamber-secrets/",
                "type": {
                    "key": "/type/link"
                }
            },
            {
                "url": "https://en.wikipedia.org/wiki/Harry_Potter_and_the_Chamber_of_Secrets",
                "title": "Wikipedia entry",
                "type": {
                    "key": "/type/link"
                }
            },
            {
                "title": "Harry Potter and the Chamber of Secrets by J.K. Rowling - review | Children's books | The Guardian",
                "url": "https://www.theguardian.com/childrens-books-site/2015/mar/02/review-j-k-rowling-harry-potter-chamber-secrets",
                "type": {
                    "key": "/type/link"
                }
            },
            {
                "url": "https://www.theguardian.com/childrens-books-site/2016/may/26/harry-potter-and-the-chamber-of-secrets-jk-rowling-review",
                "title": "Harry Potter and the Chamber of Secrets by J.K. Rowling - review 2 | Children's books | The Guardian",
                "type": {
                    "key": "/type/link"
                }
            }
        ],
        "title": "Harry Potter and the Chamber of Secrets",
        "covers": [
            8234423,
            8237628,
            8237644,
            8392798,
            8995302,
            8762432,
            8081272,
            8353396,
            10301720,
            8938317,
            10471286,
            10413455,
            10487260,
            -1,
            10535729,
            10722535,
            10722534,
            11522289,
            12347254,
            12581306,
            12606939,
            10536577,
            11540339,
            12023623
        ],
        "subject_places": [
            "England",
            "London",
            "Hogwarts School of Witchcraft and Wizardry",
            "Inglaterra",
            "Privet Drive"
        ],
        "subjects": [
            "Fantasy fiction",
            "school stories",
            "Fiction",
            "Fantasy",
            "Nestlé Smarties Book Prize winner",
            "Juvenile fiction",
            "Wizards",
            "Magic",
            "Schools",
            "Spanish language materials",
            "Magia",
            "Escuelas",
            "Ficción juvenil",
            "Novela fantástica",
            "Hogwarts School of Witchcraft and Wizardry (Imaginary place)",
            "Harry Potter (Fictitious character)",
            "Wizards -- Juvenile fiction",
            "Witches",
            "Hogwarts School of Witchcraft and Wizardry (Imaginary organization)",
            "Magos",
            "Translations from English",
            "Chinese fiction",
            "Orphans",
            "Aunts",
            "Uncles",
            "Cousins",
            "Determination (Personality trait) in children",
            "Friendship",
            "Potter, Harry (Fictitious character)",
            "Witches Fiction",
            "Wizards Fiction",
            "Schools Fiction",
            "England Fiction",
            "Magic -- Juvenile fiction",
            "Hogwarts School of Witchcraft and Wizardry (Imaginary place) -- Juvenile fiction",
            "Schools -- Juvenile fiction",
            "Wizards -- Fiction",
            "Magic -- Fiction",
            "Schools -- Fiction",
            "England -- Juvenile fiction",
            "England -- Fiction",
            "Fantasy & Magic",
            "Action & Adventure",
            "Witchcraft",
            "Harry Potter (Fictional character)",
            "Engels",
            "Social Themes",
            "Reading Level-Grade 11",
            "Reading Level-Grade 12",
            "Schools, fiction",
            "England, fiction",
            "Potter, harry (fictitious character), fiction",
            "Hogwarts school of witchcraft and wizardry (imaginary organization), fiction",
            "Wizards, fiction",
            "Magic, fiction",
            "Children's fiction",
            "Adventure and adventurers, fiction",
            "English literature",
            "Fiction, fantasy, general",
            "Large type books",
            "Hermione Granger (Fictitious character)",
            "Ron Weasley (Fictitious character)",
            "Latin language materials",
            "Children's stories",
            "Magiciens",
            "Romans, nouvelles, etc. pour la jeunesse",
            "Nécromancie",
            "Écoles",
            "Potter, Harry (Personnage fictif)",
            "Romans, nouvelles",
            "Magie",
            "Family",
            "Orphans & Foster Homes",
            "Magía",
            "Novela juvenil",
            "Juvenile",
            "Children's stories, English",
            "Sieg",
            "Basilisk",
            "Das Böse",
            "Das Gute",
            "Internat",
            "Lebensgefahr",
            "Lebensrettung",
            "List",
            "Magier",
            "Jugendbuch",
            "Kampf",
            "Schule",
            "Basilisk (Fabeltier)",
            "Junge",
            "Phönix",
            "Deutschland Grenzschutzkommando Mitte Schule",
            "Deutschland",
            "Friendship, fiction",
            "Hogwartes School of Witchcraft and Wizardry (Imaginary place)",
            "General",
            "Social Issues",
            "Witches, fiction"
        ],
        "subject_people": [
            "Harry Potter",
            "Hermione Granger",
            "Ron Weasley",
            "Albus Dumbledore",
            "Hagrid",
            "The Dursleys",
            "Gilderoy Lockhart",
            "Dobby",
            "Moaning Myrtle",
            "Ginny Weasley",
            "Draco Malfoy",
            "Hermine Granger",
            "Ron Weasly",
            "Harry Potter (Fictitious character)"
        ],
        "key": "/works/OL82537W",
        "authors": [
            {
                "author": {
                    "key": "/authors/OL23919A"
                },
                "type": {
                    "key": "/type/author_role"
                }
            }
        ],
        "excerpts": [
            {
                "excerpt": "Not for the first time, an argument had broken out over breakfast at number four, Privet Drive.",
                "comment": "first sentence",
                "author": {
                    "key": "/people/seabelis"
                }
            }
        ],
        "type": {
            "key": "/type/work"
        },
        "latest_revision": 80,
        "revision": 80,
        "created": {
            "type": "/type/datetime",
            "value": "2009-10-17T07:07:29.461716"
        },
        "last_modified": {
            "type": "/type/datetime",
            "value": "2022-06-22T07:57:49.863271"
        }
    },

CodePudding user response:

Not every entry has both a title and a description field. Therefore you have to use try...except clauses to prevent a KeyError from being raised.

def trim_json(d):
    newd = {}
    try:
        newd["title"] = d["title"]
    except KeyError:
        pass
    try:
        newd["description"] = d["description"]
    except KeyError:
        pass
    return newd
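The try...except version above simply leaves the key out when it is missing, whereas the question asks for a placeholder value. A minimal sketch of that variant, assuming dict.get with a default is acceptable; the keys and default parameters and the sample entries are illustrative, not part of the original code:

```python
def trim_json(d, keys=("title", "description"), default="undefined"):
    # dict.get returns `default` instead of raising KeyError when a key is absent
    return {k: d.get(k, default) for k in keys}


# Two shortened entries modelled on the sample in the question
entries = [
    {"title": "Journey to Hogwarts", "key": "/works/OL28602152W"},
    {"title": "Harry Potter and the Chamber of Secrets", "description": "Harry Potter #2"},
]

trimmed = [trim_json(e) for e in entries]
print(trimmed)
```

With this variant every resulting dict has both keys, so downstream code does not need to check for a missing description.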

Or, more elegantly, you could use a key filter in a dictionary comprehension:

key_filter = ['title', 'description']
cleaned_data = [{k: d[k] for k in key_filter if k in d} for d in book_data]

And since the first element in the entries list is not book data (it has neither a title nor a description key), you should start the list comprehension after the first element:

def clean():
    books = requests.get('https://openlibrary.org/authors/OL23919A/works.json')
    books_parsed = books.json()
    book_data = books_parsed['entries']
    cleaned_data = [trim_json(d) for d in book_data[1:]]
    return cleaned_data

This avoids getting an empty dictionary that corresponds to no book.
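Slicing with [1:] relies on the entry without fields always being first. A hedged alternative is to filter out whatever comes back empty, wherever it appears. In this sketch the clean_entries helper and the sample list are hypothetical and only illustrate the idea:

```python
key_filter = ["title", "description"]


def clean_entries(book_data):
    # Keep only the wanted keys that are actually present in each entry
    cleaned = [{k: d[k] for k in key_filter if k in d} for d in book_data]
    # Drop dictionaries that ended up empty (entries with neither field)
    return [c for c in cleaned if c]


sample = [
    {"key": "/works/placeholder"},  # hypothetical entry with neither field
    {"title": "Journey to Hogwarts"},
    {"title": "Harry Potter and the Chamber of Secrets", "description": "Harry Potter #2"},
]

print(clean_entries(sample))
```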

CodePudding user response:

Use the json library. It is part of the Python standard library, so nothing needs to be installed.

Let us say your JSON string is stored in a variable called json_str; then we can run:

import json

info = json.loads(json_str)
title = info['title']
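A self-contained sketch of the same idea, with a made-up json_str so it can be run as-is. Note that the response.json() call in the question already returns parsed Python objects, so json.loads is only needed when you start from a raw string:

```python
import json

# Illustrative JSON string; a real one would come from a file or an HTTP response body
json_str = '{"title": "Journey to Hogwarts", "revision": 1}'

info = json.loads(json_str)  # parse the string into a dict
title = info["title"]
# The description key is missing here, so fall back to a placeholder
description = info.get("description", "undefined")

print({"title": title, "description": description})
```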

Happy debugging :)
