JSON decoder stops after roughly 1000 iterations


I wrote a small program that iterates over 1500 JSON files, parses them, and inserts the records into a MySQL database. But after roughly 1000 iterations my Python script stops with the following error:

Traceback (most recent call last):
  File "/home/ubuntu/myprogram/main.py", line 107, in <module>
    exec(open("/home/ubuntu/myprogram/subprogramthatscalled.py").read())
  File "<string>", line 32, in <module>
  File "/usr/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I double-checked all the JSON files and found no errors.

Here is the code I used:

path = "/home/ubuntu/mypath"
path_list = os.listdir(path)

for file in path_list:
    if file.startswith("stuff") and file.endswith(".json"):
        each_file = path + file
        json_file = open(each_file)
        json_eingelesen = json.load(json_file)
        json_objects = len(json_eingelesen['data']['children'])
        for k in range(json_objects):
            keys = ""
            values = ""
            for (k, v) in json_eingelesen['data']['children'][k]['data'].items():
                keys += str(k) + ", "
                # here was some little code to prepare certain kinds of variables
                values += "'" + str(v) + "', "
            keys = keys[:-2]
            values = values[:-2]
            query = "INSERT INTO reddit_subreddit_posts (%s) VALUES (%s)" % (keys, values)
            # I know SQL injection is easy here, but my Raspberry Pi is not
            # reachable from outside my network; I'll secure that later.
            queries.execute(query)
            mydb.commit()
        os.remove(each_file)
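
As an aside to the comment above: most Python MySQL drivers (for example mysql-connector-python or PyMySQL) let you pass the values as a separate argument to execute(), which avoids the hand-built quoting. A minimal sketch, assuming the same json_eingelesen structure and the queries/mydb objects from the code above:

record = json_eingelesen['data']['children'][k]['data']
columns = ", ".join(record.keys())
placeholders = ", ".join(["%s"] * len(record))

# Column names still have to be interpolated into the statement,
# but the values are passed separately so the driver escapes them.
query = "INSERT INTO reddit_subreddit_posts (%s) VALUES (%s)" % (columns, placeholders)
queries.execute(query, list(record.values()))
mydb.commit()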

Now the fun part: when I restart the script, it runs through without an error.

So my question is: are there any restrictions stopping the Python JSON decoder, or any other part of my code, from iterating over the rest of the JSON files?

Here are the initial bytes in hex: 7b 22 6b 69 6e 64 22 3a
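
(For anyone who wants to reproduce this check: a small sketch, assuming the same directory and filename pattern as above, that prints the first eight bytes of every file in hex. A UTF-8 BOM would show up as ef bb bf.)

import os

path = "/home/ubuntu/mypath"

for file in os.listdir(path):
    if file.startswith("stuff") and file.endswith(".json"):
        with open(os.path.join(path, file), "rb") as fh:
            print(file, fh.read(8).hex(" "))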

CodePudding user response:

You need to wrap the json.load(json_file) call in a try ... except block in order to catch these errors.

Most likely, the contents of one of your files is not valid JSON.
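
A minimal sketch of that, assuming the loop from the question, which reports the offending file and skips it instead of crashing:

try:
    json_eingelesen = json.load(json_file)
except json.JSONDecodeError as err:
    # Report which file failed and why, then move on to the next one.
    print("Skipping %s: %s" % (each_file, err))
    continue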

CodePudding user response:

You probably won't see the error in the file because the offending character is invisible.

The error is at character 0, so it is most likely a BOM (Byte Order Mark), which some text editors insert at the start of a file when saving it as UTF-8.

You can either edit the file to remove the BOM, or pass encoding='utf-8-sig' when opening the file:

json_file = open(each_file, encoding='utf-8-sig')
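
If you would rather clean the files up once instead of changing every open() call, here is a rough sketch (assuming the same path and filename pattern as in the question) that strips a leading UTF-8 BOM in place:

import codecs
import os

path = "/home/ubuntu/mypath"

for file in os.listdir(path):
    if file.startswith("stuff") and file.endswith(".json"):
        full = os.path.join(path, file)
        with open(full, "rb") as fh:
            raw = fh.read()
        if raw.startswith(codecs.BOM_UTF8):
            # Rewrite the file without the three BOM bytes (ef bb bf).
            with open(full, "wb") as fh:
                fh.write(raw[len(codecs.BOM_UTF8):])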