printing json objects in Python

Time:07-18

My movie data combines movie scripts from several script websites with basic data from the IMDb website. I am trying to get the first "file_name" under "files" and the "id" under "imdb" for each movie.

This is the first movie from my data:

{
    "10thingsihateaboutyou": {
        "files": [
            {
                "name": "10 Things I Hate About You",
                "source": "imsdb",
                "file_name": "10-Things-I-Hate-About-You",
                "script_url": "https://imsdb.com/scripts/10-Things-I-Hate-About-You.html",
                "size": 215724
            },
            {
                "name": "10 Things I Hate About You",
                "source": "screenplays",
                "file_name": "10-Things-I-Hate-About-You",
                "script_url": "https://www.screenplays-online.de/screenplay.php/119",
                "size": 130951
            },
        "imdb": {
            "title": "10 Things I Hate About You",
            "release_date": 1999,
            "id": "0147800"
        }
}

I keep getting the following error with my code below.

    file_name = data[movie]["files"]["file_name"]
TypeError: list indices must be integers or slices, not str

import json

with open('clean_meta.json') as json_file:
    data = json.load(json_file)

script_files = []
id_list = []

for movie in data:
    file_name = data[movie]["files"]["file_name"]
    i_d = data[movie]["imdb"]["id"]
    script_files.append(file_name)
    id_list.append(i_d)

close('clean_meta.json')

CodePudding user response:

data[movie]["files"] is a list, not a dictionary.

You'll need to loop over this list to get further information.

for movie in data:
    for file in data[movie]["files"]:
        file_name = file["file_name"]
        script_files.append(file_name)
    i_d = data[movie]["imdb"]["id"]
    id_list.append(i_d)

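Also, the with block already closes the file when it exits, so you don't need to call close at all. Note that close('clean_meta.json') is not a valid call in any case: close is a method on the file object, not a built-in function that takes a file name.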
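If only the first "file_name" per movie is wanted, as the question states, indexing the list with 0 is enough. The following is a minimal sketch under a few assumptions: the sample data is inlined and trimmed to the fields used (instead of being read from clean_meta.json), and movies with an empty "files" list are skipped, which is a choice made here rather than something the question specifies.

```python
import json

# Inline sample mirroring the structure in the question (trimmed to the fields used).
raw = """
{
    "10thingsihateaboutyou": {
        "files": [
            {"source": "imsdb", "file_name": "10-Things-I-Hate-About-You"},
            {"source": "screenplays", "file_name": "10-Things-I-Hate-About-You"}
        ],
        "imdb": {"id": "0147800"}
    }
}
"""
data = json.loads(raw)

script_files = []
id_list = []

for movie, info in data.items():
    files = info["files"]  # a list of dicts, one per script source
    if not files:
        # Assumption: movies without any script file are skipped entirely
        continue
    script_files.append(files[0]["file_name"])  # first entry only
    id_list.append(info["imdb"]["id"])

print(script_files)
print(id_list)
```

Appending both values in the same iteration keeps script_files and id_list the same length, so the entries at a given position belong to the same movie.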

CodePudding user response:

{
    "10thingsihateaboutyou": {
        "files": [
            {
                "name": "10 Things I Hate About You",
                "source": "imsdb",
                "file_name": "10-Things-I-Hate-About-You",
                "script_url": "https://imsdb.com/scripts/10-Things-I-Hate-About-You.html",
                "size": 215724
            },
            {
                "name": "10 Things I Hate About You",
                "source": "screenplays",
                "file_name": "10-Things-I-Hate-About-You",
                "script_url": "https://www.screenplays-online.de/screenplay.php/119",
                "size": 130951
            }
        ],
        "imdb": {
            "title": "10 Things I Hate About You",
            "release_date": 1999,
            "id": "0147800"
        }
    }
}

In the snippet you posted, the square bracket for "files" is never closed, so the JSON is invalid and parsing it will raise an error. The version above closes it.
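To illustrate the kind of error an unclosed bracket produces, here is a small sketch. The string is a hypothetical, shortened stand-in for the posted snippet, not the real file. Since the question reports a TypeError rather than a parse error, the actual clean_meta.json is presumably well-formed and only the pasted excerpt was cut off.

```python
import json

# Hypothetical, shortened snippet with the "files" list left unclosed.
broken = '{"movie": {"files": [{"file_name": "a"}, "imdb": {"id": "0147800"}}}'

try:
    json.loads(broken)
    parsed_ok = True
except json.JSONDecodeError as exc:
    # The parser is still inside the list when it meets the ":" after "imdb"
    parsed_ok = False
    print("Invalid JSON:", exc.msg)

print(parsed_ok)
```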

In any case, a nested structure like this takes some care, because the type changes at each level:

data["10thingsihateaboutyou"] is a dict

data["10thingsihateaboutyou"]["files"] is a list

data["10thingsihateaboutyou"]["files"][0] is a dict
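A quick way to confirm this nesting is to print the type at each level. The sketch below uses a trimmed stand-in for the loaded data (only the fields needed here) and then reproduces the error from the question by indexing the list with a string.

```python
# Trimmed stand-in for the loaded JSON (same nesting as the question's data).
data = {
    "10thingsihateaboutyou": {
        "files": [{"file_name": "10-Things-I-Hate-About-You"}],
        "imdb": {"id": "0147800"},
    }
}

movie = data["10thingsihateaboutyou"]
print(type(movie).__name__)              # dict
print(type(movie["files"]).__name__)     # list
print(type(movie["files"][0]).__name__)  # dict

# Indexing the list with a string reproduces the error from the question.
try:
    movie["files"]["file_name"]
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```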

So data["10thingsihateaboutyou"]["files"] has to be treated as a list, but your code treated it as a dictionary. A list can only be indexed with an integer, like this:

print(data["10thingsihateaboutyou"]["files"][0]) # access first element of "files"

Output:
{'name': '10 Things I Hate About You',
 'source': 'imsdb',
 'file_name': '10-Things-I-Hate-About-You',
 'script_url': 'https://imsdb.com/scripts/10-Things-I-Hate-About-You.html',
 'size': 215724} # Note that it returns a dict

Or with a slice, like this:

print(data["10thingsihateaboutyou"]["files"][:]) # Access all elements in "files"

Output:
[{'name': '10 Things I Hate About You',
  'source': 'imsdb',
  'file_name': '10-Things-I-Hate-About-You',
  'script_url': 'https://imsdb.com/scripts/10-Things-I-Hate-About-You.html',
  'size': 215724},
 {'name': '10 Things I Hate About You',
  'source': 'screenplays',
  'file_name': '10-Things-I-Hate-About-You',
  'script_url': 'https://www.screenplays-online.de/screenplay.php/119',
  'size': 130951}] # Note that it returns a list of dictionaries

Be aware that a dictionary can also be accessed with an integer, but only when that integer is one of its keys, like so:

my_dict = {
    1: {"name": "John", "lastname": "Doe"},
    2: {"name": "Jane", "lastname": "Doe"},
}

print(my_dict[2])
Output:
{'name': 'Jane', 'lastname': 'Doe'}

Here I accessed the key 2, not the second position. With a list, the second item sits at index 1:

my_list = [["John", "Doe"],["Jane", "Doe"]]

print(my_list[1])
Output:
['Jane', 'Doe']

Don't forget that list indexes start from 0, which means the first item in a list is at index 0.

You also didn't say why you want the first file_name, so I took the liberty of selecting the file_name only when the source is "imsdb". Here's how I'd do it:

for movie in data:
    files = data[movie]["files"]
    for entry in files:
        if entry["source"] == "imsdb":
            script_files.append(entry["file_name"])
    id_list.append(data[movie]["imdb"]["id"])
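One thing to watch with two separate lists: a movie with no "imsdb" entry adds an id but no file name, so the lists can drift out of step. A possible variant, sketched here with a trimmed stand-in for the data, stores one value per movie in a dictionary keyed by IMDb id. The second movie ("examplemovie" with id "0000000") is hypothetical and only there to show the no-match case; script_by_id is a name introduced for this sketch.

```python
# Trimmed stand-in for the loaded JSON; the second movie is hypothetical
# and has no "imsdb" entry.
data = {
    "10thingsihateaboutyou": {
        "files": [
            {"source": "imsdb", "file_name": "10-Things-I-Hate-About-You"},
            {"source": "screenplays", "file_name": "10-Things-I-Hate-About-You"},
        ],
        "imdb": {"id": "0147800"},
    },
    "examplemovie": {
        "files": [{"source": "screenplays", "file_name": "Example-Movie"}],
        "imdb": {"id": "0000000"},
    },
}

script_by_id = {}
for movie, info in data.items():
    # next() yields the first matching file_name, or None when nothing matches
    file_name = next(
        (entry["file_name"] for entry in info["files"] if entry["source"] == "imsdb"),
        None,
    )
    script_by_id[info["imdb"]["id"]] = file_name

print(script_by_id)
```

Whether a missing match should be stored as None or skipped depends on how the result is used later; None is used here only to make the gap visible.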