I've written some code, but it does not output what I expected.
Here is the code:
query_words = ['dollar', 'probabilistic']
query_word_to_synonym_dict = {'probabilistic': ['probabilistic'], 'dollar' : ['currency']}
mail_ids = {123, 108}
big_ds = {}
empty_dict = {}
index = {'probabilistic':{(108, 1)}, 'currency':{(123, 1)}}
for mail_id in mail_ids:
    empty_dict = dict.fromkeys(query_words, [])
    big_ds.update({mail_id:empty_dict})
for query_word in query_words:
    syns = query_word_to_synonym_dict[query_word]
    for syn in syns:
        index_of_word = index[syn]
        tuple_first = []
        for tuples in index_of_word:
            tuple_first.append(tuples[0])
        for number in tuple_first:
            (big_ds[number][query_word]).append(syn)
print(big_ds)
The expected final value of big_ds is:
{123: {'dollar': ['currency'], 'probabilistic': []}, 108: {'dollar': [], 'probabilistic': ['probabilistic']}}
But the code sets the value of big_ds to the following:
{123: {'dollar': ['currency'], 'probabilistic': ['currency']}, 108: {'dollar': ['probabilistic'], 'probabilistic': ['probabilistic']}}
I asked a similar question a while back ("How do I resolve this unexpected output in Python code?") and was able to solve the issue for that use case. But the code I wrote then fails when query_words has more than one element.
I can't seem to figure out how to fix things. Any solution?
Answer:
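It's because of this call:
dict.fromkeys(query_words, [])
The [] is evaluated only once, so within each mail_id sub-dict every key shares the same list instance. Appending through one key therefore shows up under every other key of that sub-dict.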
See:
- "Least Astonishment" and the Mutable Default Argument
- Dictionary creation with fromkeys and mutable objects. A surprise
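Here is a minimal sketch of the sharing behaviour, using made-up keys 'a' and 'b' that are not part of the question. It contrasts dict.fromkeys, which stores one list object under every key, with a dict comprehension, which evaluates [] once per key:

```python
# Keys 'a' and 'b' are illustrative only; they do not come from the question.
shared = dict.fromkeys(['a', 'b'], [])
shared['a'].append(1)
print(shared)                      # {'a': [1], 'b': [1]}
print(shared['a'] is shared['b'])  # True: one list object under both keys

separate = {key: [] for key in ['a', 'b']}
separate['a'].append(1)
print(separate)                        # {'a': [1], 'b': []}
print(separate['a'] is separate['b'])  # False: each key has its own list
```

The same applies to any mutable default passed to dict.fromkeys (lists, dicts, sets). Immutable defaults such as None or 0 are not affected, because they cannot be changed in place.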
Try this instead:
query_words = ['dollar', 'probabilistic']
query_word_to_synonym_dict = {'probabilistic': ['probabilistic'], 'dollar' : ['currency']}
mail_ids = {123, 108}
big_ds = {}
index = {'probabilistic':{(108, 1)}, 'currency':{(123, 1)}}
for mail_id in mail_ids:
    big_ds[mail_id] = {word: [] for word in query_words}
for query_word in query_words:
    syns = query_word_to_synonym_dict[query_word]
    for syn in syns:
        index_of_word = index[syn]
        tuple_first = []
        for tuples in index_of_word:
            tuple_first.append(tuples[0])
        for number in tuple_first:
            big_ds[number][query_word].append(syn)
print(big_ds)
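If it helps, the same fix can be wrapped in a function. This is only one possible arrangement: the name build_big_ds and its parameter names are hypothetical, and the sketch assumes (as the code above does) that the first element of every index tuple is a mail id present in mail_ids.

```python
def build_big_ds(query_words, synonyms, mail_ids, index):
    """Map each mail id to {query_word: [synonyms found in that mail]}."""
    # A fresh list per (mail_id, word) pair avoids the shared-list problem.
    result = {mail_id: {word: [] for word in query_words} for mail_id in mail_ids}
    for word in query_words:
        for syn in synonyms[word]:
            # Assumption: the first element of each index tuple is a mail id.
            for mail_id, *_ in index[syn]:
                result[mail_id][word].append(syn)
    return result


query_words = ['dollar', 'probabilistic']
query_word_to_synonym_dict = {'probabilistic': ['probabilistic'], 'dollar': ['currency']}
mail_ids = {123, 108}
index = {'probabilistic': {(108, 1)}, 'currency': {(123, 1)}}

big_ds = build_big_ds(query_words, query_word_to_synonym_dict, mail_ids, index)
print(big_ds)
```

With the data from the question, big_ds compares equal to the expected dictionary given there. A mail id that appears in index but not in mail_ids would raise a KeyError, exactly as in the loop-based version.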