I'm trying to build an inverted index from a subset of documents, but I'm not getting the expected return values.
The value for 'experiment' in d1 should be just [0]; instead I get a list containing the positions of both 'experiment' and 'studi'.
When I try to clear the new list, I get empty lists back.
subset = {'d1': ['experiment', 'studi', 'wing', 'propel', 'slipstream', 'made', 'order', 'determin', 'spanwis',
                 'distribut', 'lift', 'increas', 'due', 'slipstream', 'differ'],
          'd2': ['studi', 'high-spe', 'viscou', 'flow',
                 'past', 'two-dimension', 'bodi', 'usual', 'necessari', 'consid', 'curv', 'shock', 'wave', 'emit', 'nose',
                 'lead', 'studi', 'bodi', '.', 'consequ']}
set_set = ['experiment', 'studi']
new = []
inv_index = {}
final = {}
for word in set_set:
    for key, values in subset.items():
        for value in values:
            if word == value:
                new.append(values.index(word))
        inv_index[key] = new
    final[word] = inv_index
final
### Output
# {'experiment': {'d1': [0, 1, 0, 0], 'd2': [0, 1, 0, 0]},
#  'studi': {'d1': [0, 1, 0, 0], 'd2': [0, 1, 0, 0]}}
# should be {'experiment': {'d1': [0]}, 'studi': {'d1': [1], 'd2': [0, 16]}}
CodePudding user response:
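You're tracking a lot of state you don't need. Also remember that list.index()
does not give you what you want when there are duplicates: it
always returns the index of the FIRST match, so a repeated word is reported at the same position every time. Use enumerate to get the real position of each element.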
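To make the duplicate problem concrete, here is a minimal sketch on a shortened, made-up list (the name words is hypothetical and is not the subset from the question). It compares positions found with list.index() against positions found with enumerate:

```python
# Hypothetical shortened list with a duplicate 'studi' (not the real subset).
words = ['studi', 'flow', 'studi']

# list.index() returns the position of the first match on every call,
# so the second 'studi' is reported at position 0 as well.
first_only = [words.index(w) for w in words if w == 'studi']

# enumerate() pairs each element with its actual position.
all_positions = [i for i, w in enumerate(words) if w == 'studi']

print(first_only)     # [0, 0]
print(all_positions)  # [0, 2]
```

This is the same reason the question's loop could never produce 16 for the second 'studi' in d2.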
This does what you ask:
subset = {'d1': ['experiment', 'studi', 'wing', 'propel', 'slipstream', 'made', 'order', 'determin', 'spanwis',
                 'distribut', 'lift', 'increas', 'due', 'slipstream', 'differ'],
          'd2': ['studi', 'high-spe', 'viscou', 'flow',
                 'past', 'two-dimension', 'bodi', 'usual', 'necessari', 'consid', 'curv', 'shock', 'wave', 'emit', 'nose',
                 'lead', 'studi', 'bodi', '.', 'consequ']}
set_set = ['experiment', 'studi']
final = {}
for word in set_set:
    # One inner dict per word, mapping document key -> list of positions.
    final[word] = {}
    for key, values in subset.items():
        # enumerate gives the true position of every match, duplicates included.
        found = [idx for idx, value in enumerate(values) if word == value]
        # Only record documents that actually contain the word.
        if found:
            final[word][key] = found
print(final)
Output:
{'experiment': {'d1': [0]}, 'studi': {'d1': [1], 'd2': [0, 16]}}
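As for why the original output showed the same list [0, 1, 0, 0] under every key: the question's code creates new and inv_index once, before the loops, so every entry ends up pointing at the same objects. A minimal sketch of that effect, using made-up names (shared, fresh, result, fresh_result are all hypothetical):

```python
# One list created outside the loop and stored under every key.
shared = []
result = {}
for key in ['d1', 'd2']:
    shared.append(key)
    result[key] = shared  # stores a reference to the same list, not a copy

print(result)                        # {'d1': ['d1', 'd2'], 'd2': ['d1', 'd2']}
print(result['d1'] is result['d2'])  # True

# Creating a new list inside the loop keeps the entries independent.
fresh_result = {}
for key in ['d1', 'd2']:
    fresh = [key]
    fresh_result[key] = fresh

print(fresh_result)                  # {'d1': ['d1'], 'd2': ['d2']}
```

This also explains the empty lists after clearing: clearing the one shared list empties it everywhere it is referenced. Building found and final[word] fresh inside the loops, as in the code above, avoids both problems.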