Function doesn't remove punctuations

Time:01-23

I'm new to Python and I'm having problems with my function. I need to remove punctuation symbols from a string, so the simplest thing I could think of was to loop over the string and replace each punctuation character with an empty string. The function does remove the full stops, but it doesn't remove the comma. I tried debugging it: the condition does recognise the comma, but the comma is not removed.

The code is this:

string = "the Zen of Python, by Tim Peters. beautiful is better than ugly. explicit is better than implicit. simple is better than complex."


def remove_punctuation(string):

    punctuation = [",", "."]

    for char in string:
        if char in punctuation:
            raw_string = string.replace(char, "")

    return raw_string

print(remove_punctuation(string))

The thing is the exercise says I can only use replace or del, so I'm a bit restricted with this.

CodePudding user response:

solution number 1:

with replacing:

def remove_punctuation(string):
    punctuation = [",", "."]

    for char in punctuation:
        string = string.replace(char, "")

    return string


string = "the Zen of Python, by Tim Peters. beautiful is better than ugly." \
         " explicit is better than implicit. simple is better than complex."

print(remove_punctuation(string))

Basically, we call replace once for each character in punctuation and assign the result back to string, so each replacement builds on the previous one.

solution number 2:

If you want better performance, you can use str.translate on the string:

def remove_punctuation(string):
    punctuation = [",", "."]
    table = str.maketrans(dict.fromkeys(punctuation))
    return string.translate(table)

In a translation table, every key whose value is None is removed from the string. dict.fromkeys creates a dictionary from an iterable and uses None as the value for every key (None is the default value).
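To make the table less abstract, here is a small sketch that builds it and prints it. The sample text and the variable names mapping, text and cleaned are just for illustration.

```python
# Build the same kind of translation table as above and inspect it.
punctuation = [",", "."]

mapping = dict.fromkeys(punctuation)  # {',': None, '.': None}
table = str.maketrans(mapping)        # keys become code points: {44: None, 46: None}

text = "simple, is better. than complex."
cleaned = text.translate(table)       # characters mapped to None are dropped

print(table)
print(cleaned)
```

The string is scanned only once, however many characters are in the table, which is why this tends to scale better than repeated replace calls.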

CodePudding user response:

I would propose to use regex and its magic to strip away the special characters:

import re

foo = "I'm a dirty string...$@@1##@*((#*@"

clean_foo = re.sub(r'\W+', '', foo)

print(clean_foo)  # Removes spaces as well (outputs Imadirtystring1)

clean_foo2 = re.sub(r'[^A-Za-z0-9 ]+', '', foo)

print(clean_foo2)  # Removes special chars only, keeps spaces (outputs Im a dirty string1)
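One difference between the two patterns that is easy to miss: \W treats the underscore as a word character, while the explicit character class does not. A quick sketch, using a made-up sample string:

```python
import re

sample = "snake_case, it's ok."

# \W matches anything that is not a letter, digit, or underscore,
# so underscores survive while spaces are removed.
no_nonword = re.sub(r"\W+", "", sample)

# The explicit class keeps only ASCII letters, digits, and spaces,
# so the underscore is removed but the spaces survive.
ascii_only = re.sub(r"[^A-Za-z0-9 ]+", "", sample)

print(no_nonword)
print(ascii_only)
```

Which one fits depends on whether underscores and spaces count as "punctuation" for your purposes.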

CodePudding user response:

Each call to replace works on the original string rather than on the result of the previous replacement, so only the last replacement survives.

Try this:

def remove_punctuation(string):

    punctuation = [",", "."]

    for punc in punctuation:
        string = string.replace(punc, "")

    return string

print(remove_punctuation(string))

CodePudding user response:

You can import punctuation from the string module; it is a string containing all ASCII punctuation characters. Then add each character of the sentence to raw_string if it isn't in punctuation.

from string import punctuation

sentence = "the Zen of Python, by Tim Peters. beautiful is better than ugly. explicit is better than implicit. simple is better than complex."


def remove_punctuation(string):
    raw_string = ""
    for char in string:
        if char not in punctuation:
            raw_string += char

    return raw_string

print(remove_punctuation(sentence))
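For reference, string.punctuation is a plain str, so the "not in" test above is a simple membership check. A short sketch (the sample sentence is made up) also shows that it covers more than commas and full stops, apostrophes included:

```python
import string

# string.punctuation is a str of 32 ASCII punctuation characters.
print(type(string.punctuation).__name__)
print(string.punctuation)

sample = "I'm new, really."
# Keep only the characters that are not ASCII punctuation.
kept = "".join(ch for ch in sample if ch not in string.punctuation)
print(kept)
```

If you only want to strip commas and full stops, a custom list like [",", "."] is the safer choice.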

CodePudding user response:

If you wish to have the shortest and fastest answer, you can just remove everything with a translation table like so:

>>> import string
>>> s = "the Zen of Python, by Tim Peters. beautiful is better than ugly. explicit is better than implicit. simple is better than complex."
>>> s.translate(str.maketrans("", "", string.punctuation))
'the Zen of Python by Tim Peters beautiful is better than ugly explicit is better than implicit simple is better than complex'

CodePudding user response:

In your code, every call to replace inside the loop uses string, which is never modified. The comma is replaced first, but when the loop reaches a dot, the replacement starts again from the same unmodified value of string.

So string is the source every time, and only the result of the last replacement (the one that removes the dots) is stored in raw_string, giving you a result with only the dots removed.
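A small trace may make this easier to see. This is only a sketch with a shortened sample string; the names text, buggy and fixed are made up for the illustration.

```python
text = "a, b. c, d."
punctuation = [",", "."]

# Pattern from the question: every replace starts again from the
# untouched `text`, so only the last punctuation character seen counts.
buggy = text
for char in text:
    if char in punctuation:
        buggy = text.replace(char, "")

# Accumulating pattern: each replace works on the previous result.
fixed = text
for char in punctuation:
    fixed = fixed.replace(char, "")

print(buggy)  # commas are still there
print(fixed)  # commas and dots are both gone
```

The last punctuation character in the sample is a dot, so the first loop ends up with just the dots removed, exactly as in the question.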

Alternatively, a single call to re.sub with the character class [,.] matches either character:

import re

string = "the Zen of Python, by Tim Peters. beautiful is better than ugly. explicit is better than implicit. simple is better than complex."

def remove_punctuation(string):
    return re.sub(r"[,.]", "", string)

print(remove_punctuation(string))

Output

the Zen of Python by Tim Peters beautiful is better than ugly explicit is better than implicit simple is better than complex

CodePudding user response:

You redefine raw_string at every iteration from scratch, so you only see the effect of the last iteration. You need to 'accumulate' the modifications:

def remove_punctuation(string):

    raw_string = string 
    
    punctuation = [",", "."]

    for char in string:
        if char in punctuation:
            raw_string = raw_string.replace(char, "")

    return raw_string

You can also do this with a generator expression (with punctuation defined as above):

"".join(c for c in string if c not in punctuation)