How to find duplicate letters and remove all of them from a string in a list

Time:06-20

Given a list of names:

name_list = ['Jasonn', 'pPeter', 'LiSsa', 'Joanna'] 

I want to remove repeated letters (case-insensitive). For example, name_list[0] becomes 'Jaso', and name_list[3] becomes 'Jo', because once the 'n's are removed, the 'a's should be removed too.

Here's my code:

i = 0
for name in name_list:
    ind = name_list.index(name)
    length = len(name)
    for i in range(0,length-1):
        if name[i].lower() == name[i + 1].lower():
            name = name_list[ind].replace(name[i], '', 1)
            name = name.replace(name[i], '', 1)
            length -= 2
            if i >= 1 and name[i].lower() == name[i-1].lower():
                name = name_list[ind].replace(name[i], '', 1)
                name = name.replace(name[i-1], '', 1)
        else:
            i += 1
    if ind != len(name_list): 
        print(sep,end='', sep='') #sep is my separator
print()

My code does not run. It fails on this line:

if i >= 1 and name[i].lower() == name[i-1].lower():

with:

IndexError: string index out of range

I can't figure out why the index is out of range. My first thought was to check that the index is greater than 0 so that i-1 would not be negative. For example, given the string 'pPeter', after removing 'pP', I just compare the new letter 'e' at i = 0 with 't' at i + 1, since there is no letter before index 0.

And for 'J[0]o[1]a[2]n[3]n[4]a[5]':

  1. When i = 3, the 'n's at i and i + 1 are removed. The string then becomes 'J[0]o[1]a[2]a[3]'.
  2. Since i = 3 > 0 and the letters at i-1 and i are both 'a', we remove the 'a's and get 'Jo'.

Could someone help me figure out where I went wrong?

CodePudding user response:

This approach looks unnecessarily complex.

Instead, you can count how often each letter occurs in a name (ignoring case) and keep only the letters that occur exactly once. Note that this drops every repeated letter, whether or not the repeats are adjacent:

from collections import Counter

name_list = ['Jasonn', 'pPeter', 'LiSsa', 'Joanna']
result = []

for name in name_list:
    letter_freqs = Counter(name.lower())
    result.append(''.join(letter for letter in name if letter_freqs[letter.lower()] == 1))

print(result)

This outputs:

['Jaso', 'tr', 'Lia', 'Jo']
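
As for the IndexError in the question, two things seem to be at play: after a pair at the end of the string is removed, index i no longer exists in the shortened string, and range(0, length-1) is built once, so decrementing length inside the loop does not shorten the iteration. A minimal sketch of both effects follows; the variable names shortened, error_message, and visited are illustrative and not part of the original code:

```python
name = 'Jasonn'
i = 4                                  # the pair 'nn' starts at index 4
shortened = name[:i] + name[i + 2:]    # remove the pair -> 'Jaso'

# 'Jaso' only has indices 0..3, so reading index 4 fails,
# which is the same kind of failure as name[i] in the question.
try:
    shortened[i]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# range() is evaluated once; changing `length` afterwards
# does not affect the indices the loop still visits.
length = len(name)
visited = []
for j in range(0, length - 1):
    visited.append(j)
    length -= 2                        # no effect on the running loop

print(shortened, error_message, visited)
```

Guarding every index access with a check against len(name), or rebuilding the string instead of editing it while looping over its indices, avoids both problems.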

CodePudding user response:

With regular expressions:

from re import sub, IGNORECASE

name_list = ['Jasonn', 'pPeter', 'LiSsa', 'Joanna']
result = []
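for name in name_list:
    # Repeatedly delete adjacent equal letters until nothing changes,
    # so letters that become adjacent after a removal are deleted too.
    while True:
        name2 = sub(r'(\w)\1', '', name, flags=IGNORECASE)
        if name2 == name:
            result.append(name2)
            break
        name = name2
print(result)

This outputs:

['Jaso', 'eter', 'Lia', 'Jo']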
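
For comparison, the same cascading removal of adjacent pairs can be sketched without regular expressions by using a list as a stack. This is only one possible approach, and the function name remove_adjacent_pairs is hypothetical:

```python
def remove_adjacent_pairs(name):
    """Remove adjacent equal letters (case-insensitive), cascading as pairs disappear."""
    stack = []
    for letter in name:
        if stack and stack[-1].lower() == letter.lower():
            stack.pop()           # the new letter cancels the one on top
        else:
            stack.append(letter)  # no match, keep the letter
    return ''.join(stack)


name_list = ['Jasonn', 'pPeter', 'LiSsa', 'Joanna']
result = [remove_adjacent_pairs(name) for name in name_list]
print(result)
```

Each letter is pushed and popped at most once, so a single pass over the name is enough. On the sample list this gives the same result as the regular expression loop.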