I want to remove adjacent duplicates of specific strings from a list. Suppose I have the list below:
list_ex = ['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']
Here is what I have tried so far:
new_list_ex = []
for item in list_ex:
    if item.startswith('<word>'):
        if item in new_list_ex and (item == list_ex[list_ex.index(item) + 1]):
            continue
    new_list_ex.append(item)
My output of new_list_ex:
['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', 'again', '.']
Desired output:
['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']
I suspect that list_ex[list_ex.index(item) + 1], which I use to detect the adjacent element, does not work properly. How can I adjust it to get the desired output?
Please note that the order of the list is important.
CodePudding user response:
Test whether a word flagged with <word> is the last item in the new list (new_list_ex[-1]); if so, continue (skip it). If not, just append the word to the new list.
list_ex = ['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']
new_list_ex = []
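for item in list_ex:
    # Skip a <word>-tagged item that repeats the last item kept.
    # The new_list_ex check avoids an IndexError if the very first item is tagged.
    if item.startswith('<word>') and new_list_ex and item == new_list_ex[-1]:
        continue
    new_list_ex.append(item)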
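As for why the attempt in the question dropped the last <word>conference</word>: list.index returns the position of the first match, so list_ex.index(item) + 1 always points just past the first occurrence, not past the item currently being visited. A minimal sketch of comparing against the actual neighbour by position instead, using enumerate (the function name dedupe_adjacent and the tag parameter are made up for illustration, and the short sample list is not the one from the question):

```python
def dedupe_adjacent(items, tag='<word>'):
    """Drop a tagged item when it repeats the item directly before it."""
    result = []
    for pos, item in enumerate(items):
        # Compare with the real previous element, found by position,
        # rather than with whatever list.index() would return.
        if pos > 0 and item.startswith(tag) and item == items[pos - 1]:
            continue
        result.append(item)
    return result


sample = ['the', '<word>a</word>', '<word>a</word>', ',', 'to', '<word>a</word>', 'x', 'x']
# list.index reports the first match, even when asked about a later copy.
print(sample.index('<word>a</word>'))
print(dedupe_adjacent(sample))
```

Untagged repeats such as 'x', 'x' are left alone, since only items starting with the tag are candidates for removal.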
CodePudding user response:
I think you mean something like this:
list_ex = ['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']
new_list = []
for i in list_ex:
    if "<" in i and ">" in i:
        if i.split(">")[1].split("<")[0] in new_list:
            continue
        else:
            new_list.append(i.split(">")[1].split("<")[0])
    else:
        if i in new_list:
            continue
        else:
            new_list.append(i)
print(new_list)
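Note that the loop above strips the <word> tags and drops every repeated token, not only adjacent ones, so for the list in the question it would also remove the second 'I' and the later 'to'. If the goal is the desired output from the question (tags kept, only adjacent tagged repeats collapsed), one possible sketch uses itertools.groupby, which groups consecutive equal items. The function name collapse_tagged_runs and the tag parameter are assumptions made for illustration:

```python
from itertools import groupby


def collapse_tagged_runs(items, tag='<word>'):
    """Collapse each run of identical adjacent tagged items to one copy."""
    result = []
    for value, run in groupby(items):
        run = list(run)
        # A run of identical tagged items keeps only its first copy;
        # runs of untagged items are kept in full.
        result.extend(run[:1] if value.startswith(tag) else run)
    return result


sample = ['to', 'the', '<word>hotel</word>', '<word>hotel</word>', '.', 'to', '<word>hotel</word>']
print(collapse_tagged_runs(sample))
```

Order is preserved because groupby walks the list once from left to right.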