JSON - How to return the location of an error?

Time:06-07

When I try to read a JSON file into Python using Python's built-in json package, I get back a JSONDecodeError that looks something like this:

JSONDecodeError: Expecting value: line 1 column 233451 (char 233450)

Is there any way to return the location of the error (in this case, 233450)? What I want is something like:

try:
    json.loads(my_json)
except:
    error_loc = json.get_error(my_json)

where error_loc = 233450 - or even just the entire error message as a string, I can extract the number myself.

Context: I'm trying to load some very poorly formatted (webscraped) JSONs into Python. Many of the errors are related to the fact that the text contained in the JSONs contains quotes, curly brackets, and other characters that the json reader interprets as formatting - e.g.

{"category": "this text contains "quotes", which messes with the json reader",
"category2": "here's some more text containing "quotes" - and also {brackets}"},
{"category3": "just for fun, here's some text with {"brackets and quotes"} in conjunction"}

I managed to eliminate the majority of these situations using regex, but now there's a small handful of cases where I accidentally replaced necessary quotes. Looking through the JSONs manually, I don't actually think it's possible to catch all the bad formatting situations without replacing at least one necessary character. And in almost every situation, the issue is just one missing character, normally towards the very end...

If I could return the location of the error, I could just revert the replaced character and try again.

I feel like there has to be a way to do this, but I don't think I'm using the correct terms to search for it.

CodePudding user response:

You can bind the exception to a variable with except json.decoder.JSONDecodeError as error. The JSONDecodeError object has an attribute pos, which gives the index in the string at which decoding failed. lineno and colno give the corresponding line and column numbers, like the position an editor shows when you open the file. If you want the entire error message as a string, str(error) returns it, and error.msg holds the message without the position suffix.

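import json

try:
    json.loads(string_with_json)
except json.decoder.JSONDecodeError as error:
    error_pos = error.pos        # 0-based index into the string (the "char" value)
    error_lineno = error.lineno  # 1-based line number
    error_colno = error.colno    # 1-based column number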
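
To inspect what sits at the reported position before deciding which character to revert, something like the following might help. It is only a sketch: the helper name locate_json_error, the context parameter, and the sample string are made up for illustration and are not part of the json module.

```python
import json


def locate_json_error(text, context=20):
    """Return None if text parses, otherwise a dict describing the first decode error.

    locate_json_error and context are hypothetical names used for illustration.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        start = max(error.pos - context, 0)
        return {
            "pos": error.pos,        # 0-based index of the failure in text
            "lineno": error.lineno,  # 1-based line number
            "colno": error.colno,    # 1-based column number
            "msg": error.msg,        # message without the position suffix
            "message": str(error),   # full message, including "(char N)"
            "snippet": text[start:error.pos + context],  # text around the failure
        }
    return None


# Sample input with an unescaped inner quote, similar to the scraped data.
bad = '{"category": "this text contains "quotes" inside"}'
info = locate_json_error(bad)
if info is not None:
    print(info["message"])
    print(repr(info["snippet"]))
```

Note that the parser reports the first place where the input stops being valid JSON, which can be later than the character that actually caused the problem, so the snippet is a starting point for a fix rather than a guaranteed pinpoint.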