Python nested dictionary Issue when iterating

Time:11-13

I have 5 lists of words, which act as the values in a dictionary whose keys are the document IDs.

For each document, I would like to apply some calculations and display the values and results of the calculation in a nested dictionary.

So far so good: I managed to do everything else, but I am failing at the easiest part.

When I display the resulting nested dictionary, it seems to keep only the last element of each of the 5 lists, so most of the elements are missing.

Could anybody explain where I am going wrong?

This is the original dictionary data_docs:

{'doc01': ['simpl', 'hello', 'world', 'test', 'python', 'code'],
 'doc02': ['today', 'wonder', 'day'],
 'doc03': ['studi', 'pac', 'today'],
 'doc04': ['write', 'need', 'cup', 'coffe'],
 'doc05': ['finish', 'pac', 'use', 'python']}

This is the result I am getting (for example, doc01 is missing 'simpl', 'hello', 'world', 'test' and 'python'):

{'doc01': {'code': 0.6989700043360189},
 'doc02': {'day': 0.6989700043360189},
 'doc03': {'today': 0.3979400086720376},
 'doc04': {'coffe': 0.6989700043360189},
 'doc05': {'python': 0.3979400086720376}}

And this is the code:

    def tfidf (data, idf_score): #function, 2 dictionaries as parameters
      tfidf = {} #dict for output
      for word, val in data.items(): #for each word and value in data_docs(first dict)
        for v in val: #for each value in each list
          a = val.count(v) #count the number of times that appears in that list
          scores = {v :a * idf_score[v]} # dictionary that will act as value in the nested
        tfidf[word] = scores #final dictionary, the key is doc01,doc02... and the value the above dict
      return tfidf
    
    tfidf(data_docs, idf_score)

Thanks,

CodePudding user response:

Did you mean to do this?

    def tfidf(data, idf_score):  # function, 2 dictionaries as parameters
        tfidf = {}  # dict for output
        for word, val in data.items():  # word is the document ID (doc01, doc02...), val is its list of words
            scores = {}  # <---- a new dict for each outer iteration
            for v in val:  # for each word in the list
                a = val.count(v)  # count the number of times it appears in that list
                scores[v] = a * idf_score[v]  # <---- keep adding items to the dictionary
            tfidf[word] = scores  # final dictionary, the key is doc01,doc02... and the value the above dict
        return tfidf

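See my changes marked with the <---- arrows :) In your version, scores = {v: ...} builds a brand-new one-item dictionary on every pass of the inner loop, so only the last word of each list survives. Creating scores once per document and adding keys to it keeps them all. With every idf_score value set to 1 (apparently a placeholder), it returns: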

{'doc01': {'simpl': 1,
  'hello': 1,
  'world': 1,
  'test': 1,
  'python': 1,
  'code': 1},
 'doc02': {'today': 1, 'wonder': 1, 'day': 1},
 'doc03': {'studi': 1, 'pac': 1, 'today': 1},
 'doc04': {'write': 1, 'need': 1, 'cup': 1, 'coffe': 1},
 'doc05': {'finish': 1, 'pac': 1, 'use': 1, 'python': 1}}
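
For a version you can run end to end, here is a minimal sketch. The question does not show how idf_score was built, so build_idf below is a hypothetical helper; the formula log10(number of documents / number of documents containing the word) is only my assumption, chosen because it looks consistent with the values in your output. collections.Counter replaces the repeated val.count(v) calls but gives the same counts.

```python
import math
from collections import Counter

data_docs = {
    'doc01': ['simpl', 'hello', 'world', 'test', 'python', 'code'],
    'doc02': ['today', 'wonder', 'day'],
    'doc03': ['studi', 'pac', 'today'],
    'doc04': ['write', 'need', 'cup', 'coffe'],
    'doc05': ['finish', 'pac', 'use', 'python'],
}


def build_idf(data):
    """Hypothetical helper: idf = log10(n_docs / docs containing the word).

    This formula is an assumption, not taken from the question.
    """
    n_docs = len(data)
    # set(words) makes each document count a word at most once
    doc_freq = Counter(word for words in data.values() for word in set(words))
    return {word: math.log10(n_docs / df) for word, df in doc_freq.items()}


def tfidf(data, idf_score):
    """Return {doc_id: {word: term count * idf}} with every word of every document."""
    result = {}
    for doc_id, words in data.items():
        counts = Counter(words)  # term counts for this document only
        # One inner dict per document, filled with all of its words
        result[doc_id] = {word: n * idf_score[word] for word, n in counts.items()}
    return result


idf_score = build_idf(data_docs)
scores = tfidf(data_docs, idf_score)
print(scores['doc01'])
```

The dict comprehension builds the inner dictionary in one step per document, so there is no variable left inside the inner loop that could be overwritten by accident.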