I am trying to write a minimal algorithm that exhaustively searches a list of strings for duplicates and removes them by index, without changing the case of any word, since that could change its meaning.
The caveat is that the list contains words such as Blood, blood, DNA, ACTN4, 34-methyl-O-carboxy, Brain, brain-facing-mouse, BLOOD and so on.
I only want to remove the duplicates of the word 'blood', keep the first occurrence (the one with the first letter capitalized), and not modify the case of any other word. Any suggestions on how I should proceed?
Here is my code:
def remove_duplicates(list_of_strings):
    """ function that takes input of a list of strings,
    uses index to iterate over each string lowers each string
    and returns a list of strings with no duplicates, does not modify the original strings
    an exhaustive search to remove duplicates using index of list and list of string"""
    list_of_strings_copy = list_of_strings
    try:
        for i in range(len(list_of_strings)):
            list_of_strings_copy[i] = list_of_strings_copy[i].lower()
            word = list_of_strings_copy[i]
            for j in range(len(list_of_strings_copy)):
                if word == list_of_strings_copy[j]:
                    list_of_strings.pop(i)
                    j += 1
    except Exception as e:
        print(e)
    return list_of_strings
CodePudding user response:
Make a dictionary, {text.lower():text,...}.
d = {}
for item in list_of_strings:
    if item.lower() not in d:
        d[item.lower()] = item
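list(d.values()) should be what you want: the first spelling seen for each word, in the original order (dicts preserve insertion order in Python 3.7+).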
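To try the dictionary idea on the list from the question, it can be wrapped in a function. This is only a minimal sketch: the name `dedupe_case_insensitive` is made up for illustration, and it assumes "duplicate" means "equal after `str.lower()`".

```python
def dedupe_case_insensitive(list_of_strings):
    """Return the strings with case-insensitive duplicates removed.

    The first spelling seen for each word is kept unchanged, and the
    input list itself is not modified.
    """
    first_seen = {}  # lowercase key -> original spelling
    for item in list_of_strings:
        key = item.lower()
        if key not in first_seen:
            first_seen[key] = item
    # dicts keep insertion order (Python 3.7+), so the order is preserved
    return list(first_seen.values())


words = ["Blood", "blood", "DNA", "ACTN4", "34-methyl-O-carboxy",
         "Brain", "brain-facing-mouse", "BLOOD"]
print(dedupe_case_insensitive(words))
# ['Blood', 'DNA', 'ACTN4', '34-methyl-O-carboxy', 'Brain', 'brain-facing-mouse']
```

Because only the dictionary keys are lowercased, "DNA" and "ACTN4" come back exactly as they were written.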
CodePudding user response:
I think something like the following would do what you need:
def remove_duplicates(list_of_strings):
    new_list = []  # create empty return list
    for string in list_of_strings:  # iterate through list of strings
        # capitalize the first letter and lowercase the rest
        # (note: this also turns "DNA" into "Dna")
        string = string[0].capitalize() + string[1:].lower()
        if string not in new_list:  # check string is not a duplicate in the returned list
            new_list.append(string)  # if string not in list, append to returned list
    return new_list  # return end list
strings = ["Blood", "blood", "DNA", "ACTN4", "34-methyl-O-carboxy", "Brain", "brain-facing-mouse", "BLOOD"]
returned_strings = remove_duplicates(strings)
print(returned_strings)
(For reference, this was written in Python 3.10.)
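Note that the function above normalizes the case of every word. If the original spelling has to survive, as the question asks, one possible variant compares lowercase keys but appends the untouched string. This is a rough sketch rather than a definitive version: the name `remove_duplicates_keep_case` is hypothetical, and `str.casefold()` is used here as a slightly more aggressive alternative to `lower()`.

```python
def remove_duplicates_keep_case(list_of_strings):
    """Drop case-insensitive duplicates, keeping the first occurrence as written."""
    seen = set()   # case-folded keys already encountered
    new_list = []  # original strings, in order of first appearance
    for string in list_of_strings:
        key = string.casefold()  # used for comparison only
        if key not in seen:
            seen.add(key)
            new_list.append(string)  # append the unmodified string
    return new_list


strings = ["Blood", "blood", "DNA", "ACTN4", "34-methyl-O-carboxy",
           "Brain", "brain-facing-mouse", "BLOOD"]
print(remove_duplicates_keep_case(strings))
# ['Blood', 'DNA', 'ACTN4', '34-methyl-O-carboxy', 'Brain', 'brain-facing-mouse']
```

Checking membership in a set instead of in `new_list` also avoids rescanning the result list for every word, which matters for long lists.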