Why doesn't my function remove all the things as programmed?

Time:01-02

I wrote a function that reads a text file and returns its contents as a list of characters, excluding the characters ('\n',' ','!','.','@','#').

I tested it on a sample text file that contains: I love Computer Science sooooooooooooooo much!!!!!

Now I expect my output to look like this...

['I', 'l', 'o', 'v', 'e', 'C', 'o', 'm', 'p', 'u', 't', 'e', 'r', 'S', 'c', 'i', 'e', 'n', 'c', 'e', 's', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'm', 'u', 'c', 'h']

But my function returns:

['I', 'l', 'o', 'v', 'e', 'C', 'o', 'm', 'p', 'u', 't', 'e', 'r', 'S', 'c', 'i', 'e', 'n', 'c', 'e', 's', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'm', 'u', 'c', 'h', '!', '!']

My code is supposed to remove every '!', yet two are left at the end.

What should I change in my code?

Here is my code:

def reverse(filename):
    s = open(filename, 'r')
    content = s.read()
    g = list(content)
    for x in g:
        if x in ('\n',' ','!','.','@','#'):
            g.remove(x)
    return g

CodePudding user response:

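You can't safely modify a list while you are iterating over it. Well, you can, but each removal shifts the remaining items one position to the left while the iterator's index keeps advancing, so the item right after every removed one gets skipped. That is why some of the consecutive '!' characters survive. The right answer is to create a new list with the things you want to keep. You also don't have to convert the file contents to a list in order to iterate over them. Strings are iterables, just like lists.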

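def reverse(filename):
    # Use a context manager so the file is closed after reading.
    with open(filename, 'r') as s:
        content = s.read()
    g = []
    for c in content:
        # Keep only the characters that are not in the unwanted set.
        if c not in '\n !.@#':
            g.append(c)
    return g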

Or:

def reverse(filename):
    with open(filename, 'r') as s:
        return [c for c in s.read() if c not in '\n !.#@']
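To see the skipping in action, here is a minimal sketch that runs the original pattern on the tail of the sample text. The helper names `remove_while_iterating` and `keep_wanted` are made up for this illustration and are not part of the question's code.

```python
def remove_while_iterating(chars, unwanted):
    """Reproduce the buggy pattern: remove from the list being iterated."""
    result = list(chars)
    for ch in result:
        if ch in unwanted:
            # Everything after the removed item shifts left by one,
            # so the loop's next step jumps over one item.
            result.remove(ch)
    return result


def keep_wanted(chars, unwanted):
    """Build a new list instead of mutating the one being iterated."""
    return [ch for ch in chars if ch not in unwanted]


sample = "much!!!!!"
print(remove_while_iterating(sample, "!"))  # two '!' survive
print(keep_wanted(sample, "!"))             # no '!' left
```

With five '!' in a row, the loop removes one at its third, fourth and fifth-from-last steps and then runs off the end of the shortened list, which leaves the same two '!' the question shows.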

CodePudding user response:

You can't safely remove items from a list while iterating over that same list from front to back, because every removal changes the indices of the remaining items. A simple workaround is to iterate over the list in reverse order.

The solution is:

def reverse(filename):
    # Use a context manager so the file is closed after reading.
    with open(filename, 'r') as s:
        content = s.read()
    g = list(content)
    for x in reversed(g):
        if x in ('\n',' ','!','.','@','#'):
            g.remove(x)
    return g

Output:

['I','l','o','v','e','C','o','m','p','u','t','e','r','S','c','i','e','n','c','e','s','o','o','o','o','o','o','o','o','o','o','o','o','o','o','o','m', 'u','c','h']
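One subtlety worth checking: g.remove(x) deletes the first occurrence of x, which is not necessarily the item the reversed loop is currently looking at. The sketch below compares the reversed loop against a plain filter on the sample sentence. The names `strip_reversed`, `strip_filter` and `UNWANTED` are hypothetical, and the comparison relies on how CPython's reversed list iterator behaves when the list shrinks, so treat it as a sanity check rather than a guarantee.

```python
UNWANTED = ('\n', ' ', '!', '.', '@', '#')


def strip_reversed(text):
    """Reversed iteration with remove(), as in the answer above."""
    g = list(text)
    for x in reversed(g):
        if x in UNWANTED:
            g.remove(x)  # removes the FIRST matching item in g
    return g


def strip_filter(text):
    """Reference result: keep every character that is not unwanted."""
    return [c for c in text if c not in UNWANTED]


sample = "I love Computer Science sooooooooooooooo much!!!!!"
assert strip_reversed(sample) == strip_filter(sample)
print("".join(strip_reversed(sample)))
```

Both versions agree here, but the filter is easier to reason about and does not rescan the list for every removal.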

CodePudding user response:

As Tim's answer says, the principled way to handle this is to create a new list with the things you want to keep. That said, you might find it interesting that it is technically possible to remove items from the list in place without skipping anything if you traverse the list backwards by index.

So with that said, here's a modification of your code that doesn't generate a separate list.

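def reverse(filename):
    # Use a context manager so the file is closed after reading.
    with open(filename, 'r') as s:
        content = s.read()
    g = list(content)
    # Walk the indices from last to first so that popping an item
    # only shifts items that have already been checked.
    for k in reversed(range(len(g))):
        if g[k] in ('\n',' ','!','.','@','#'):
            g.pop(k)
    return g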

Note that, unlike the "remove" function, the "pop" function deletes by index, so it does not have to search through the list for a matching value first. Both still have to shift the items that come after the deleted position.

Alternatively, if you want to keep using the "remove" function, you can take advantage of the fact that it already searches the list for the first matching value, so you don't need to walk the list yourself. Consider the following:
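A tiny sketch of the difference between the two calls, using a made-up four-item list (the variable names are only for illustration):

```python
letters = ['a', '!', 'b', '!']

by_value = list(letters)
by_value.remove('!')        # searches for '!' and deletes the first match (index 1)

by_index = list(letters)
removed = by_index.pop(3)   # deletes whatever sits at index 3, no search

print(by_value)  # the second '!' is still there
print(by_index)  # the first '!' is still there
```

This is why pop(k) fits the backwards index loop: it deletes exactly the item being looked at, whereas remove may delete a different, earlier one.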

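def reverse(filename):
    # Use a context manager so the file is closed after reading.
    with open(filename, 'r') as s:
        content = s.read()
    g = list(content)

    for c in ('\n',' ','!','.','@','#'):
        # Keep removing c until no occurrence is left.
        while c in g:
            g.remove(c)
    return g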

Notably, the "c in g" check in the while condition performs an extra search on every pass. We can avoid this by catching the ValueError that the remove function raises when the character isn't found.

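def reverse(filename):
    # Use a context manager so the file is closed after reading.
    with open(filename, 'r') as s:
        content = s.read()
    g = list(content)

    for c in ('\n',' ','!','.','@','#'):
        while True:
            try:
                g.remove(c)
            except ValueError:
                # remove() raises ValueError once c is no longer in g.
                break
    return g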

CodePudding user response:

It might be more efficient to remove the characters with str.replace:

def reverse(filename):
    with open(filename) as f:
        s = f.read()
    for c in '\n !.@#':
        s = s.replace(c, '')
    return list(s)

That goes over the string once for each character to remove, but such string methods are very fast.
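If the repeated passes are a concern, a different technique (not the one used in this answer) is str.translate with a deletion table built by str.maketrans, which handles all unwanted characters in a single pass. A minimal sketch, where the function name `strip_chars` and its `unwanted` parameter are assumptions made for illustration, and which takes the text directly instead of a filename:

```python
def strip_chars(text, unwanted='\n !.@#'):
    """Return the characters of text as a list, minus the unwanted ones."""
    # The third argument of str.maketrans lists characters to delete.
    table = str.maketrans('', '', unwanted)
    return list(text.translate(table))


print(strip_chars("I love CS much!!!!!\n"))
```

Whether this is actually faster than a handful of replace calls depends on the input, so it is worth timing both on real data before choosing.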