I am new to using for loops and am trying to create an acronym lookup. There is more code after this that runs well, but when I try to look up an acronym that should have been added to the dictionary, it is not there. It may be a simple syntax error or a larger issue.
import requests
data_site = requests.get('https://raw.githubusercontent.com/priscian/nlp/master/OpenNLP/models/coref/acronyms.txt')
text_line = data_site.text.split('\n')
acronyms_dict = {}
for line in text_line:
    one_line = line.split(' ',1)
    if len(one_line) > 1:
        acro = one_line[0].strip()
        meaning = one_line[1].strip()
        if acro in acronyms_dict:
            acronyms_dict[acro] = acronyms_dict[acro] + ', ' + meaning
        else:
            acronyms_dict[acro] = meaning
A big problem I had at first was deciding where to split. As I understand the current setup, each line is split once, directly after the acronym, and strip removes the surrounding whitespace. The document is imported at the top of my code. I would like every entry in it to be added to the dictionary. The if statement is there because some keys have multiple values, so when I enter such a key later I would like it to output all of its values joined into one string, separated by commas. Any help is appreciated.
Alternatively, my problem could be with the retrieval of the acronym. That part of the code is here:
lookup_loop = 0
while lookup_loop == 0:
    print('Let us lookup some acronyms!')
    lookup = input('Enter an acronym: ')
    if lookup in acronyms_dict:
        print("{} is short for {}".format(lookup, acronyms_dict[lookup]))
        # format() formats the specified value(s) and inserts them inside the string's placeholder
    else:
        new_entry = input('This key does not seem to be in the dictionary\nWould you like to create a value and add the term?\nYes=1\tNo=2\t')
CodePudding user response:
Remove the ' ' argument from the str.split call. The file uses tabs to delimit the acronyms, and with no separator given, str.split splits on any run of whitespace, tabs included:
import requests
data_site = requests.get(
    "https://raw.githubusercontent.com/priscian/nlp/master/OpenNLP/models/coref/acronyms.txt"
)
text_line = data_site.text.split("\n")
acronyms_dict = {}
for line in text_line:
    one_line = line.split(maxsplit=1)  # <-- remove the ' '
    if len(one_line) > 1:
        acro = one_line[0].strip()
        meaning = one_line[1].strip()
        if acro in acronyms_dict:
            acronyms_dict[acro] = acronyms_dict[acro] + ", " + meaning
        else:
            acronyms_dict[acro] = meaning
print(acronyms_dict)
Prints:
{
'24KHGE': '24 Karat Heavy Gold Electroplate',
'2B1Q': '2 Binary 1 Quaternary',
'2D': '2-Dimensional',
...
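To see why the separator matters without downloading the file, here is a minimal sketch that runs on a small in-memory sample. The helper name build_acronyms and the sample lines are made up for illustration and are not taken from the real file. It assumes the same tab-delimited layout, and it uses collections.defaultdict as one possible alternative to the if/else check for repeated keys.

```python
from collections import defaultdict


def build_acronyms(text):
    """Map each acronym to a comma-separated string of its meanings.

    Hypothetical helper for illustration, not part of the original code.
    """
    meanings = defaultdict(list)
    for line in text.splitlines():
        # With no separator, split() breaks on any whitespace run, tabs included.
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            acro, meaning = parts
            meanings[acro].append(meaning.strip())
    return {acro: ", ".join(found) for acro, found in meanings.items()}


# Hypothetical sample in the same tab-delimited layout as the real file.
sample = "2D\t2-Dimensional\nAA\tAuto Answer\nAA\tAutomobile Association\n"

# Splitting on a literal space never isolates the acronym on a tab-delimited line.
print("2D\t2-Dimensional".split(" ", 1))      # ['2D\t2-Dimensional']
print("2D\t2-Dimensional".split(maxsplit=1))  # ['2D', '2-Dimensional']

acronyms = build_acronyms(sample)
print(acronyms["AA"])  # Auto Answer, Automobile Association
```

Collecting the meanings in a list and joining them at the end gives the same comma-separated output as the string concatenation above, and keeps the individual meanings available if they are needed separately later.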