How to remove the first occurrence of a repeated character with Python

I have been given the string 'abcdea', and I need to find the repeated character and remove its first occurrence, so the result must be 'bcdea'. I have tried the following, but it gives a different result:

def remove_rep(x):
    new_list = []
    for i in x:
        if i not in new_list:
            new_list.append(i)
    new_list = ''.join(new_list)
    print(new_list)

remove_rep('abcdea')

The result is 'abcde', not the 'bcdea' I was looking for.

CodePudding user response：

You could make use of str.find(), which returns the index of the first occurrence within the string:

def remove_rep(oldString):
    newString = ''
    for i in oldString:
        if i in newString:
            # Character used previously, .find() returns the first position within string
            first_position_index = newString.find(i)
            newString = newString[:first_position_index] + newString[
                first_position_index + 1:]
        newString += i

    print(newString)


remove_rep('abcdea')
remove_rep('abcdeaabcdea')

Out:

bcdea
bcdea
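
The slicing step above can be sketched on its own. This is a minimal illustration, assuming a hypothetical helper named drop_first that is not part of the answer's code; it only relies on str.find() returning -1 when there is no match:

```python
def drop_first(text: str, ch: str) -> str:
    """Return text without the first occurrence of ch (hypothetical helper)."""
    index = text.find(ch)  # index of the first match, or -1 if ch is absent
    if index == -1:
        return text
    # Keep everything before the match and everything after it.
    return text[:index] + text[index + 1:]


print(drop_first('abcde', 'a'))
print(drop_first('abcde', 'z'))
```

The first call prints bcde and the second prints abcde unchanged, since 'z' does not occur.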

CodePudding user response：

Change

new_list = ''.join(new_list)

to

new_list = ''.join(new_list[1:] + [i])

(and figure out why! Hint: what's the condition of your if block? What are you checking for and why?)
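
Applied to the whole function, the change might look like the sketch below. The name remove_rep_suggested and the return instead of print are assumptions made for illustration. It appears to rely on the repeated character being both the first and the last character of a non-empty input, as in 'abcdea':

```python
def remove_rep_suggested(x: str) -> str:
    # Assumes x is non-empty; otherwise i is never assigned.
    new_list = []
    for i in x:
        if i not in new_list:
            new_list.append(i)
    # After the loop, i still holds the last character of x.
    # Drop the first unique character and append that last character.
    return ''.join(new_list[1:] + [i])


print(remove_rep_suggested('abcdea'))
print(remove_rep_suggested('abcdeb'))
```

The first call prints bcdea. The second prints bcdeb, which drops 'a' rather than the first 'b', so this shortcut only fits inputs shaped like the one in the question.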

CodePudding user response：

One approach can be to iterate in reverse order over the string, and keep track of all the characters seen in the string. If a character is repeated, we don't add it to the new_list.

def remove_rep(x: str):
    new_list = []
    seen = set()

    for char in reversed(x):
        if char not in seen:
            new_list.append(char)
        seen.add(char)

    return ''.join(reversed(new_list))

print(remove_rep('abcdea'))

Result: 'bcdea'
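
The same keep-the-last-occurrence idea can also be sketched with dict.fromkeys, assuming Python 3.7+ where dictionaries preserve insertion order. The name remove_rep_keep_last is hypothetical:

```python
def remove_rep_keep_last(x: str) -> str:
    # dict.fromkeys keeps each key the first time it appears; on the
    # reversed string that is the last occurrence in the original.
    last_kept = dict.fromkeys(reversed(x))
    # Reverse again to restore the original left-to-right order.
    return ''.join(reversed(list(last_kept)))


print(remove_rep_keep_last('abcdea'))
print(remove_rep_keep_last('abcdeaca'))
```

This prints bcdea and then bdeca: for 'abcdeaca' only the last a and the last c survive, which matters when a character occurs more than twice.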

Note that the above solution doesn't exactly work as desired, as it'll remove all occurrences of a character except the last one. That differs from the goal when, for example, a character occurs three times and you only want to remove the first occurrence. To resolve that, you can instead do something like below:

def remove_rep(x: str):
    new_list = []
    first_seen = set()

    for char in x:
        freq = x.count(char)
        if char in first_seen or freq == 1:
            new_list.append(char)
        elif freq > 1:
            first_seen.add(char)

    return ''.join(new_list)

Now for the given input:

print(remove_rep('abcdeaca'))

We get the desired result - only the first a and the first c are removed:

bdeaca

Test for a more complicated input:

print(remove_rep('abcdeaabcdea'))

We do get the correct result:

aabcdea

Do you see what happened in that last one? The first abcde sequence got removed, as all characters are repeated in this string. So our result is actually correct, even though it doesn't look so at an initial glance.
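
Calling x.count(char) inside the loop rescans the string for every character. A possible variant, sketched here under the assumption that collections.Counter is acceptable, counts everything once up front. The name remove_rep_counted is hypothetical:

```python
from collections import Counter


def remove_rep_counted(x: str) -> str:
    counts = Counter(x)  # one pass to count every character
    first_seen = set()
    kept = []
    for char in x:
        if counts[char] == 1 or char in first_seen:
            kept.append(char)
        else:
            # First occurrence of a repeated character: skip it once.
            first_seen.add(char)
    return ''.join(kept)


print(remove_rep_counted('abcdeaca'))
print(remove_rep_counted('abcdeaabcdea'))
```

This prints bdeaca and aabcdea, the same results as the version using x.count().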

CodePudding user response：

One approach, with one small change to the if condition:

def remove_rep(x):
    new_list = []
    visited = []
    for i, item in enumerate(x):
        if item not in x[i + 1:] or item in visited:
            new_list.append(item)
        else:
            visited.append(item)
    new_list = ''.join(new_list)
    print(new_list)

remove_rep('abcdeaa')

Output:

bcdeaa
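
The "does this character appear later?" check can also be written without building a slice on every iteration, using the optional start argument of str.find(). This is only a sketch; the name remove_rep_find is hypothetical and the function returns the result instead of printing it:

```python
def remove_rep_find(x: str) -> str:
    kept = []
    dropped = set()
    for i, item in enumerate(x):
        # Search for item strictly after position i.
        appears_later = x.find(item, i + 1) != -1
        if appears_later and item not in dropped:
            dropped.add(item)  # skip the first occurrence only
        else:
            kept.append(item)
    return ''.join(kept)


print(remove_rep_find('abcdea'))
print(remove_rep_find('abcdeaca'))
```

This prints bcdea and then bdeaca.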