I'm new to Python and I'm having problems with my function. I need to remove punctuation from a string, so the simplest thing I could think of was to loop over the string and replace each punctuation mark with an empty string. The function does remove full stops, but it doesn't remove commas. I tried to debug it: the condition recognises the comma, but the comma is not removed.
The code is this:
string = "the Zen of Python, by Tim Peters. beautiful is better than ugly. explicit is better than implicit. simple is better than complex."
def remove_punctuation(string):
    punctuation = [",", "."]
    for char in string:
        if char in punctuation:
            raw_string = string.replace(char, "")
    return raw_string
print(remove_punctuation(string))
The thing is the exercise says I can only use replace or del, so I'm a bit restricted with this.
CodePudding user response:
Solution 1, using replace:
def remove_punctuation(string):
    punctuation = [",", "."]
    for char in punctuation:
        string = string.replace(char, "")
    return string
string = "the Zen of Python, by Tim Peters. beautiful is better than ugly." \
         " explicit is better than implicit. simple is better than complex."
print(remove_punctuation(string))
Basically, we call replace once for each character in punctuation, reassigning string each time so that every replacement builds on the previous one.
Solution 2:
If you want better performance, you can use str.translate on the string:
def remove_punctuation(string):
    punctuation = [",", "."]
    table = str.maketrans(dict.fromkeys(punctuation))
    return string.translate(table)
During translation, every character whose key maps to None in the table is removed from the string. dict.fromkeys creates a dictionary from an iterable and uses None as the value for every key (None is the default value).
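To make the table concrete, here is a small sketch of what dict.fromkeys and str.maketrans produce for the two characters above. The names mapping and table and the sample text are only for illustration.

```python
punctuation = [",", "."]

# dict.fromkeys maps every item of the iterable to None by default
mapping = dict.fromkeys(punctuation)
print(mapping)  # {',': None, '.': None}

# str.maketrans converts the one-character keys to their code points
table = str.maketrans(mapping)
print(table)  # {44: None, 46: None}

# translate drops every character whose table entry is None
print("a, b. c".translate(table))  # a b c
```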
CodePudding user response:
I would suggest using a regular expression to strip away the special characters:
import re
foo = "I'm a dirty string...$@@1##@*((#*@"
clean_foo = re.sub(r'\W+', '', foo)
print(clean_foo) # Removes spaces as well (outputs Imadirtystring1)
clean_foo2 = re.sub(r'[^A-Za-z0-9 ]+', '', foo)
print(clean_foo2) # Removes special chars only, keeps spaces (outputs Im a dirty string1)
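One difference between the two patterns that may matter: \w also matches the underscore, so \W+ leaves underscores in place while the explicit character class removes them. A small sketch, with a made-up sample string:

```python
import re

sample = "snake_case, and more."  # made-up input for illustration

# \W matches anything that is not a letter, digit or underscore
no_nonword = re.sub(r"\W+", "", sample)
print(no_nonword)  # snake_caseandmore

# the explicit class keeps only ASCII letters, digits and spaces
letters_only = re.sub(r"[^A-Za-z0-9 ]+", "", sample)
print(letters_only)  # snakecase and more
```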
CodePudding user response:
Each time you find a punctuation mark, you call replace on the original string instead of on the already modified one, so the earlier replacements are lost.
Try this:
def remove_punctuation(string):
    punctuation = [",", "."]
    for punc in punctuation:
        string = string.replace(punc, "")
    return string
print(remove_punctuation(string))
CodePudding user response:
You can import punctuation from the string module; it is a string containing all ASCII punctuation characters. Then add each character of the sentence to raw_string if it isn't in punctuation.
from string import punctuation
sentence = "the Zen of Python, by Tim Peters. beautiful is better than ugly. explicit is better than implicit. simple is better than complex."
def remove_punctuation(string):
    raw_string = ""
    for char in string:
        # keep only the characters that are not punctuation
        if char not in punctuation:
            raw_string += char
    return raw_string
print(remove_punctuation(sentence))
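For reference, string.punctuation is a plain str of ASCII punctuation characters, so the in test above checks one character at a time against that string. A short sketch to inspect it:

```python
import string

# string.punctuation is a str constant, not a list
print(type(string.punctuation).__name__)  # str
print(len(string.punctuation))            # 32

# membership tests work character by character
print("," in string.punctuation)  # True
print("a" in string.punctuation)  # False
```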
CodePudding user response:
If you wish to have the shortest and fastest answer, you can just remove everything with a translation table like so:
>>> import string
>>> s = "the Zen of Python, by Tim Peters. beautiful is better than ugly. explicit is better than implicit. simple is better than complex."
>>> s.translate(str.maketrans("", "", string.punctuation))
'the Zen of Python by Tim Peters beautiful is better than ugly explicit is better than implicit simple is better than complex'
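In the three-argument form of str.maketrans, the first two arguments (which must have equal length) describe character-for-character replacements, and the third lists characters to delete. Passing two empty strings, as above, means "replace nothing, only delete". A small sketch with a made-up input:

```python
import string

# first two arguments: map "a" -> "A"; third argument: characters to delete
table = str.maketrans("a", "A", string.punctuation)
result = "banana, split.".translate(table)
print(result)  # bAnAnA split
```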
CodePudding user response:
In your code, every call to replace starts from the original value of string: the loop first removes the comma using string, and for the dot it uses the same unmodified value of string again.
So string is used as the source every time, and only the result of the last replacement (the one removing the dot) is left in raw_string, giving you the result with only the dots removed.
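A short trace of the original loop makes this visible. The sketch below uses a shortened sample string and records each intermediate value of raw_string; the list name steps is only for illustration.

```python
string = "a, b. c, d."  # shortened sample
punctuation = [",", "."]

steps = []
for char in string:
    if char in punctuation:
        # always starts again from the unmodified string
        raw_string = string.replace(char, "")
        steps.append(raw_string)

print(steps)
# ['a b. c d.', 'a, b c, d', 'a b. c d.', 'a, b c, d']
```

The last entry is what the function returns: the final punctuation mark in the string is a dot, so only the dots are gone.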
You might use a single call to re.sub with a character class matching either . or ,:
import re
string = "the Zen of Python, by Tim Peters. beautiful is better than ugly. explicit is better than implicit. simple is better than complex."
def remove_punctuation(string):
    return re.sub(r"[,.]", "", string)
print(remove_punctuation(string))
Output
the Zen of Python by Tim Peters beautiful is better than ugly explicit is better than implicit simple is better than complex
CodePudding user response:
You reassign raw_string from the original string at every iteration, so you only see the effect of the last iteration. You need to accumulate the modifications:
def remove_punctuation(string):
    raw_string = string
    punctuation = [",", "."]
    for char in string:
        if char in punctuation:
            raw_string = raw_string.replace(char, "")
    return raw_string
You can also do this with a generator expression (much like a list comprehension), given the same punctuation list:
"".join(c for c in string if c not in punctuation)