Nested dictionary TypeError: 'NoneType' object is not subscriptable

I have a dictionary called questions. Each key is a number and each value is another dictionary. Here is an example of the structure:

    questions = {
        2313: {"question": "How much is 2+2", "answers": ["3", "4", "2", "1"], "correct": 2},
        4122: {"question": "What is the capital of France?", "answers": ["Lion", "Marseille", "Paris", "Montpellier"],
               "correct": 3}
    }

I need to add more questions to the dictionary from a text file ('questions.txt'), where each question looks like this:
0#What is the capital city of USA?#Washington DC#New York#Los Angeles#Detroit#1
After opening the file, I wrote a loop that goes through all the questions and adds them to the dictionary. I used a function from my protocol 'chatlib':

def split_data(data, expected_fields):

    splitted = data.split('#')
    if len(splitted) == expected_fields:
        return splitted
    else:
        return

For example, when I use it on the question above, it returns a list that looks like this:
['0', 'What is the capital city of USA?', 'Washington DC', 'New York', 'Los Angeles', 'Detroit', '1']
I tried a lot of different ways to write the main code, including this one:

    list_new_questions = open("questions.txt").read().split('\n')
    for question in list_new_questions:
        questionlist = chatlib.split_data(question, 7)
        key = int(questionlist[0])
        questions[key] = {"question": "", "answers": [], "correct": 0}
        questions[key]["question"] = questionlist[1]
        questions[key]["answers"] = [questionlist[2], questionlist[3], questionlist[4], questionlist[5]]
        questions[key]["correct"] = int(questionlist[6])

But every time it raises an error (TypeError: 'NoneType' object is not subscriptable) and says that the value of int(questionlist[0]) is None, and I don't understand why. How can it be None? It is supposed to be the int value of the first element of the list questionlist, which is always a number. Every time I print int(questionlist[0]) it prints a number, so I don't understand why it says it's None.

CodePudding user response：

On line 3, questionlist = chatlib.split_data(question, 7), your function returns None if len(splitted) is not equal to 7.

So questionlist is None when there are not 7 fields, and then on line 4 you try to take index 0 of None.

An easy fix and restructure would be something like:

def split_data(data):
    # validate data here
    splitted = data.split("#")
    # validate splitted here if you need to
    return splitted


list_new_questions = open("questions.txt").read().split("\n")
for question in list_new_questions:
    questionlist = chatlib.split_data(question)
    # do your validation here
    if len(questionlist) == 7:
        # try except maybe if you are unsure about the data split you get
        key = int(questionlist[0])
        questions[key] = {
            "question": questionlist[1],
            "answers": questionlist[2:6],
            "correct": int(questionlist[6])
        }
    else: 
        print("Invalid question format")

CodePudding user response：

Let's look at the error message:

TypeError: 'NoneType' object is not subscriptable

Subscript notation is when you use square brackets like so: X[Y]. On the line that raises the exception, the only place where that happens is questionlist[0]. You cannot subscript or index None, because what would None[0] even mean? So that tells you that questionlist is None.
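
To see the error in isolation, here is a minimal sketch that reproduces it without any file or helper function. The variable name is simply reused from the question, and the None is assigned by hand to stand in for whatever split_data returned.

```python
# Stand-in for the value split_data returns when the field count is wrong.
questionlist = None

try:
    questionlist[0]  # subscripting None is what raises the TypeError
except TypeError as exc:
    message = str(exc)

print(message)  # 'NoneType' object is not subscriptable
```

Note that the exception is raised by the subscript itself, before int() is ever called.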

We can figure out why that is the case by looking at the previous line,

questionlist = chatlib.split_data(question, 7)

So what does split_data return?

Either return splitted, which is a list of strings, or a bare return, which is the same as return None. So we know the latter must have happened, which means you've come across a line with an unexpected number of #s.

This could have happened because 1) there is a typo somewhere, 2) you have an empty line in the file, or 3) the file ends with a newline (automatically inserted by some editors).
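
The three cases can be checked directly. The sketch below copies split_data from the question and feeds it a well-formed line, a line with a missing # (an invented typo for illustration), and the empty string that split('\n') produces when the text ends with a newline.

```python
def split_data(data, expected_fields):
    # Same behaviour as the function in the question.
    splitted = data.split('#')
    if len(splitted) == expected_fields:
        return splitted
    return None

good = "0#What is the capital city of USA?#Washington DC#New York#Los Angeles#Detroit#1"
# Invented typo: the last '#' is missing, so there are only 6 fields.
typo = "0#What is the capital city of USA?#Washington DC#New York#Los Angeles#Detroit1"

# Text that ends with a newline yields an empty last element after split('\n').
lines = (good + "\n").split("\n")

results = [split_data(line, 7) for line in lines]
typo_result = split_data(typo, 7)
empty_result = split_data("", 7)

print(results[1], typo_result, empty_result)  # None None None
```

In all three failing cases the loop in the question would go on to evaluate None[0].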

The next question is, what do you want to happen in those cases? If you just want to ignore any non-matching line, Yash's solution will help you.

But while allowing 3 is definitely good and allowing 2 is probably good for your use case as well, I don't think you want to carry on as if nothing is wrong in case 1. If there's a typo, I think you want your program to complain loudly... but maybe with a more useful error message.

You could do this without modifying split_data at all. Here's a possible solution:

    list_new_questions = open("questions.txt").read().split('\n')
    for question in list_new_questions:
        if question.strip() == '':
            # Ignore lines with only whitespace
            continue
        questionlist = chatlib.split_data(question, 7)
        if questionlist is None:
            # Not the right number of fields!
            raise Exception(f"Invalid question format: {question!r}")
        key, question, *answers, correct = questionlist
        questions[int(key)] = {"question": question, "answers": answers, "correct": int(correct)}
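
If you want the error message to point at the offending line, one possible variation is to count lines with enumerate and raise a ValueError. This is only a sketch: load_questions is a hypothetical helper name, split_data is a local stand-in for chatlib.split_data, and the text is passed in as a string so the example runs without a file.

```python
def split_data(data, expected_fields):
    # Local stand-in for chatlib.split_data from the question.
    fields = data.split('#')
    return fields if len(fields) == expected_fields else None

def load_questions(text, questions):
    """Hypothetical helper: parse `text` and add each question to `questions`."""
    for line_number, line in enumerate(text.split('\n'), start=1):
        if line.strip() == '':
            continue  # ignore blank lines, including the one after a trailing newline
        fields = split_data(line, 7)
        if fields is None:
            raise ValueError(f"line {line_number}: invalid question format: {line!r}")
        key, question, *answers, correct = fields
        questions[int(key)] = {"question": question, "answers": answers, "correct": int(correct)}
    return questions

good = "0#What is the capital city of USA?#Washington DC#New York#Los Angeles#Detroit#1"
sample = good + "\n\nbroken#line\n"  # blank line is skipped, third line is malformed

try:
    load_questions(sample, {})
except ValueError as exc:
    error = str(exc)

print(error)  # line 3: invalid question format: 'broken#line'
```

With a real file you could pass open("questions.txt").read() as the text argument.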

CodePudding user response：

Your split_data function returns None if the actual number of fields is not equal to expected_fields.

You can do it like below which can also account for if you start providing more answers.

This also removes the need for the split_data function as it can be done as part of the loop.

with open("questions.txt") as f:
    new_questions = f.readlines()

    for q in new_questions:
        q = q.split("#")
        if len(q) == 7:
            questions[int(q[0])] = {
                "question": q[1],
                "answers": q[2:-1],
                "correct": int(q[-1])}
        else:
            print(f"Invalid Data {q}")

If you wanted to provide more or fewer than 4 answers, you could do something like the code below, which only assumes that the id is the first field, the question is the second, and the correct answer is the last:

for q in new_questions:
    q = q.split("#")
    try:
        questions[int(q[0])] = {
            "question": q[1],
            "answers": q[2:-1],
            "correct": int(q[-1])}
    except (IndexError, ValueError):
        # IndexError: too few fields; ValueError: id or correct answer is not a number
        print(f"Invalid data {q}")
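
As a quick check of the variable-length idea, here is a sketch that wraps the same slicing in a hypothetical parse_question helper and runs it on in-memory lines. It also catches ValueError, since int() raises that (not IndexError) when a field such as the id is empty or not a number, which is what happens for a blank line.

```python
def parse_question(line):
    """Hypothetical helper: return (key, entry), or None if the line is malformed."""
    fields = line.strip().split('#')
    try:
        key = int(fields[0])
        entry = {
            "question": fields[1],
            "answers": fields[2:-1],      # everything between the question and the last field
            "correct": int(fields[-1]),
        }
    except (IndexError, ValueError):
        # IndexError: too few fields; ValueError: a numeric field is not a number
        return None
    return key, entry

three = parse_question("7#Pick one#a#b#c#2\n")  # made-up line with three answers
blank = parse_question("\n")                      # blank line: int('') raises ValueError

print(three)
print(blank)  # None
```

Returning None here is a design choice for the sketch; raising an exception instead would make typos in the file harder to miss.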