How to find duplicate letters and remove all of them from a string in a list

Given a list of names:

name_list = ['Jasonn', 'pPeter', 'LiSsa', 'Joanna']

I want to remove adjacent repeated letters (case-insensitive). For example, name_list[0] should become 'Jaso', and name_list[3] should become 'Jo', because once the 'n's are removed the two 'a's become adjacent and should be removed as well.

Here's my code:

i = 0
for name in name_list:
    ind = name_list.index(name)
    length = len(name)
    for i in range(0,length-1):
        if name[i].lower() == name[i + 1].lower():
            name = name_list[ind].replace(name[i], '', 1)
            name = name.replace(name[i], '', 1)
            length -= 2
            if i >= 1 and name[i].lower() == name[i-1].lower():
                name = name_list[ind].replace(name[i], '', 1)
                name = name.replace(name[i-1], '', 1)
        else:
            i += 1
    if ind != len(name_list): 
        print(sep,end='', sep='') #sep is my separator
print()

My code does not run to completion. It fails on this line:

if i >= 1 and name[i].lower() == name[i-1].lower():

with:

IndexError: string index out of range

I can't figure out why the index is out of range. My first thought was to check that the index is greater than 0 so that i-1 would not be negative. For example, given the string 'pPeter', after removing 'pP' I would just compare the new letter 'e' at i = 0 with 't' at i + 1, since there is no letter before index 0.

And for 'J[0]o[1]a[2]n[3]n[4]a[5]':

When i = 3, the 'n's at i and i + 1 are removed. The string then becomes 'J[0]o[1]a[2]a[3]'.
Since i = 3 > 0 and the letters at i-1 and i are both 'a', we remove the 'a's and get 'Jo'.

Could someone help me figure out where I went wrong?

CodePudding user response：

This approach looks unnecessarily complex.

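Instead, you can count how often each letter occurs in a name (ignoring case) and retain only the letters that appear exactly once. Note that this drops every letter that occurs more than once anywhere in the name, not only adjacent repeats: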

from collections import Counter

name_list = ['Jasonn', 'pPeter', 'LiSsa', 'Joanna']
result = []

for name in name_list:
    letter_freqs = Counter(name.lower())
    result.append(''.join(letter for letter in name if letter_freqs[letter.lower()] == 1))

print(result)

This outputs:

['Jaso', 'tr', 'Lia', 'Jo']
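
As for the IndexError in the question, a likely explanation is twofold. First, range(0, length - 1) is evaluated once when the loop starts, so changing length or i inside the body does not alter which indices are visited. Second, once two characters have been removed, name[i] can point past the end of the shortened string. The snippet below is only an illustrative sketch of those two effects, not the asker's code; the names visited, shortened and error, and the slicing used to drop the pair, are assumptions made for the demo.

```python
# Effect 1: range() is computed once; later changes to `length` or `i` are ignored.
length = 6
visited = []
for i in range(0, length - 1):
    visited.append(i)
    length -= 2   # does not shorten the loop
    i += 1        # overwritten by the next value from range()

print(visited)    # [0, 1, 2, 3, 4]

# Effect 2: after removing a pair, the old index may no longer exist.
name = 'Jasonn'
i = 4                                 # position of the first 'n' of the pair
shortened = name[:i] + name[i + 2:]   # drop both 'n's, leaving 'Jaso'
try:
    shortened[i]                      # index 4 in a 4-character string
    error = None
except IndexError as exc:
    error = str(exc)

print(shortened, error)
```

In the question's loop the same thing happens at i = 4 for 'Jasonn': the string has shrunk to four characters by the time name[i] is read again.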

CodePudding user response：

With regular expressions:

from re import sub, IGNORECASE

name_list = ['Jasonn', 'pPeter', 'LiSsa', 'Joanna']
result = []
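for name in name_list:
    while True:
        # Remove each adjacent pair of equal word characters, ignoring case.
        name2 = sub(r'(\w)(\1)', '', name, flags=IGNORECASE)
        if name2 == name:
            # No pair was removed in this pass, so the name is fully reduced.
            result.append(name2)
            break
        else:
            # Removing a pair may have created a new one, so try again.
            name = name2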
print(result)

This outputs:

['Jaso', 'eter', 'Lia', 'Jo']
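
For comparison, the adjacent-pair removal described in the question can also be sketched in a single pass with a stack instead of repeated regex substitution. This is a minimal sketch, not something from the answers above: the helper name remove_adjacent_pairs is hypothetical, and unlike the \w pattern it compares any characters, so results may differ for names containing punctuation or spaces.

```python
def remove_adjacent_pairs(name):
    """Cancel adjacent equal letters (case-insensitive) in one pass."""
    stack = []
    for letter in name:
        if stack and stack[-1].lower() == letter.lower():
            # The new letter matches the last kept letter: drop both.
            stack.pop()
        else:
            stack.append(letter)
    return ''.join(stack)


name_list = ['Jasonn', 'pPeter', 'LiSsa', 'Joanna']
result = [remove_adjacent_pairs(name) for name in name_list]
print(result)  # ['Jaso', 'eter', 'Lia', 'Jo']
```

Because the top of the stack is always the last surviving letter, the case the question worries about ('Joanna' becoming 'Joaa' and then 'Jo') is handled without any index arithmetic.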