Why am I getting TypeError on code that worked previously?

Time:03-27

I have this code to iterate through a JSON file. The user specifies the tiers to extract, whose names are then saved in inputLabels, and this for loop extracts the data from those tiers:

with open(inputfilename, 'r', encoding='utf8', newline='\r\n') as f:
        data = json.load(f)
        for line in data:
            if line['label'] in inputLabels:
                elements = [(e['body']['value']).replace(" ", "_") + "\t" for e in line['first']['items']]
                outputData.append(elements)

I wrote this code a year ago and have run it multiple times since then with no issues, but running it today I received a TypeError.

    if line['label'] in inputLabels:
TypeError: string indices must be integers

I don't understand why my code was able to work before if this is a true TypeError. Why is this only a problem in the code now, and how can I fix it?

EDIT: Pasted part of the json:

{
  "contains": [
    {
      "total": 118,
      "generated": "ELAN Multimedia Annotator 6.2",
      "id": "xxx",
      "label": "BAR001_TEXT",
      "type": "AnnotationCollection",
      "@context": "http://www.w3.org/ns/ldp.jsonld",
      "first": {
        "startIndex": "0",
        "id": "xxx",
        "type": "AnnotationPage",
        "items": [
          {
            "id": "xxx",
            "type": "Annotation",
            "body": {
              "purpose": "transcribing",
              "format": "text/plain",
              "language": "",
              "type": "TextualBody",
              "value": ""
            },
            "@context": "http://www.w3.org/ns/anno.jsonld",
            "target": {
              "format": "audio/x-wav",
              "id": "xxx",
              "type": "Audio"
            }
          },
          {
            "id": "xxx",
            "type": "Annotation",
            "body": {
              "purpose": "transcribing",
              "format": "text/plain",
              "language": "",
              "type": "TextualBody",
              "value": "Dobar vam"
            },
            "@context": "http://www.w3.org/ns/anno.jsonld",
            "target": {
              "format": "audio/x-wav",
              "id": "xxx",
              "type": "Audio"
            }
          },
          {
            "id": "xxx",
            "type": "Annotation",
            "body": {
              "purpose": "transcribing",
              "format": "text/plain",
              "language": "",
              "type": "TextualBody",
              "value": "Je"
            },
            "@context": "http://www.w3.org/ns/anno.jsonld",
            "target": {
              "format": "audio/x-wav",
              "id": "xxx",
              "type": "Audio"
            }
          },

CodePudding user response:

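Your code would probably work if you replaced "for line in data:" with "for line in data['contains']:". As written, the loop iterates over the top-level dict, so each line is a key (the string "contains") rather than a tier, and indexing a string with 'label' raises the TypeError.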

Maybe the JSON schema didn't have the "contains" level previously.
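
A minimal sketch of the cause and the fix, assuming a cut-down inline JSON string in place of the real file. The sample data and the input_labels value are made up for illustration; only the "contains", "label", "first", "items", "body" and "value" keys come from the question.

```python
import json

# Cut-down stand-in for the ELAN export; the real file is much larger.
raw = '''
{
  "contains": [
    {
      "label": "BAR001_TEXT",
      "first": {"items": [{"body": {"value": "Dobar vam"}},
                          {"body": {"value": "Je"}}]}
    }
  ]
}
'''

data = json.loads(raw)
input_labels = ["BAR001_TEXT"]  # hypothetical user selection

# Iterating over a dict yields its keys, which are strings,
# so line['label'] in the original loop indexes a string.
keys = [line for line in data]

output_data = []
for line in data["contains"]:  # iterate over the list of tiers instead
    if line["label"] in input_labels:
        elements = [e["body"]["value"].replace(" ", "_") + "\t"
                    for e in line["first"]["items"]]
        output_data.append(elements)

print(keys)         # ['contains']
print(output_data)  # [['Dobar_vam\t', 'Je\t']]
```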

CodePudding user response:

A pretty pythonic approach would be using exceptions:

with open(inputfilename, 'r', encoding='utf8', newline='\r\n') as f:
        data = json.load(f)
        for line in data:
            try:
                if line['label'] in inputLabels:
                    elements = [(e['body']['value']).replace(" ", "_") + "\t" for e in line['first']['items']]
                    outputData.append(elements)
            except Exception as e:
                print(f"{type(e)} : {e} when trying to use {line}")

Your code will then run to completion and print a hint about which item failed and why.

CodePudding user response:

Turns out it was a pretty simple fix. All of the JSON file was in a container (look at the portion I posted in the question, it's the second line, "contains":). I was able to just remove that container and its open/closing brackets and the code ran successfully after that. Thanks all for your help.
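
If editing the file by hand is not practical, the same unwrapping could be done in code. This is only a sketch: get_tiers is a hypothetical helper name, and it assumes the file is either a bare list of tiers (the old layout) or a dict with a "contains" list (the layout shown in the question).

```python
def get_tiers(data):
    """Return the list of tiers, whether or not it is wrapped in "contains"."""
    if isinstance(data, dict) and "contains" in data:
        return data["contains"]   # newer layout: tiers nested under "contains"
    if isinstance(data, list):
        return data               # older layout: tiers at the top level
    raise ValueError(f"unexpected top-level JSON type: {type(data).__name__}")
```

The loop in the question would then read "for line in get_tiers(data):" and work with both kinds of file.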
