I am writing some Python code to clean a string: convert it to lower case and remove the special characters.
string_salada_russa = ' !! LeTRas PeqUEnAS & GraNdeS'
clean_string = string_salada_russa.lower().strip()
print(clean_string)
i = 0
for c in clean_string:
    if(c.isalpha() == False and c != " "):
        clean_string = clean_string.replace(c, "").strip()
print(clean_string)
for c in clean_string:
    if(i >= 1 and i <= len(clean_string)-1):
        if(clean_string[i] == " " and clean_string[i-1] == " " and clean_string[i+1] == " "):
            clean_string = clean_string.replace(clean_string[i], "Z")
    i += 1
print(clean_string)
Expected outcome would be:
#original string
' !! LeTRas PeqUEnAS & GraNdeS'
#expected
'letras pequenas grandes'
#actual outcome
'letraspequenasgrandes'
I am trying to remove the extra spaces, but without success: I end up removing ALL spaces.
Could anyone help me figure it out? What is wrong in my code?
CodePudding user response:
How about using re?
import re
s = ' !! LeTRas PeqUEnAS & GraNdeS'
s = re.sub(r"[^a-zA-Z]+", " ", s.lower()).strip()
print(s) # letras pequenas grandes
This first converts the letters to lower case (lower), replaces each run of non-alphabetical characters with a single blank (re.sub), and then removes the blanks around the string (strip).
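One thing to keep in mind: a character class like a-z only covers unaccented ASCII letters. If the input may contain accented letters, something along these lines might work instead. This is only a sketch; the function names and the accented sample string are made up for illustration, and the second pattern only approximates "not a letter".

```python
import re

def clean_ascii(s):
    # Same idea as above: each run of characters outside a-z becomes one blank.
    return re.sub(r"[^a-z]+", " ", s.lower()).strip()

def clean_unicode(s):
    # Assumption: accented letters should be kept.
    # [\W\d_]+ matches runs of non-word characters, digits and underscores,
    # which approximates "anything that is not a letter".
    return re.sub(r"[\W\d_]+", " ", s.lower()).strip()

print(clean_ascii(" !! LeTRas PeqUEnAS & GraNdeS"))  # letras pequenas grandes
print(clean_unicode(" !! MaçÃ & PÃO 123 "))  # maçã pão
```

With clean_ascii, the accented letters in the second sample would be treated as separators and dropped.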
Btw, your code does not output 'letraspequenasgrandes'. Instead, it outputs 'letrasZpequenasZZZZZgrandes'.
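A likely reason for all those Z's: str.replace(old, new) works on values, not on positions, so clean_string.replace(clean_string[i], "Z") replaces every space in the string as soon as the condition is true once. A small sketch with a made-up sample string (the spacing is chosen only for illustration):

```python
# Hypothetical sample: three spaces after "letras", one after "pequenas".
sample = "letras   pequenas grandes"

# Index 7 is the middle space of the run of three spaces.
i = 7
assert sample[i - 1] == " " and sample[i] == " " and sample[i + 1] == " "

# replace() substitutes every occurrence of " ", not just the one at index i.
replaced = sample.replace(sample[i], "Z")
print(replaced)  # letrasZZZpequenasZgrandes
```

So even the single space between the last two words is replaced, although the condition was only met inside the longer run.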
CodePudding user response:
You could get away with a combination of str.lower(), str.split(), str.join() and str.isalpha():
def clean(s):
    return ' '.join(x for x in s.lower().split(' ') if x.isalpha())
s = ' !! LeTRas PeqUEnAS & GraNdeS'
print(clean(s))
# letras pequenas grandes
Basically, you first convert to lower case and then split by ' '. After that you filter out non-alpha tokens and join them back.
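Note that the filter works on whole tokens, so this fits inputs where the special characters stand apart from the words, as in the question. A quick sketch of what happens otherwise (the second input is a made-up example):

```python
def clean(s):
    # Keep only the space-separated tokens that consist entirely of letters.
    return ' '.join(x for x in s.lower().split(' ') if x.isalpha())

print(clean(' !! LeTRas PeqUEnAS & GraNdeS'))  # letras pequenas grandes

# Hypothetical input: punctuation glued to a word drops the whole word.
print(clean('LeTRas, PeqUEnAS GraNdeS!'))  # pequenas
```

If punctuation can be attached to words, removing characters one by one or using a regular expression is probably the safer choice.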
CodePudding user response:
There's no need to strip your string at each iteration of the first for loop; but, other than that, you could keep the first piece of your code:
for c in clean_string:
    if(c.isalpha() == False and c != " "):
        clean_string = clean_string.replace(c, "")
Then split your string, which discards all the spaces, and re-join the words into a single string, with a single space between each word:
clean_string = " ".join(clean_string.split())
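Put together, the two steps could look roughly like this. The function name clean_with_loop is just a placeholder for this sketch:

```python
def clean_with_loop(s):
    # Hypothetical wrapper name; the steps follow the two snippets above.
    cleaned = s.lower()
    for c in cleaned:
        # Drop every character that is neither a letter nor a space.
        if not c.isalpha() and c != " ":
            cleaned = cleaned.replace(c, "")
    # split() without arguments discards every run of whitespace, so
    # joining with " " leaves exactly one space between the words.
    return " ".join(cleaned.split())

print(clean_with_loop(" !! LeTRas PeqUEnAS & GraNdeS"))  # letras pequenas grandes
```

The loop iterates over the string as it was before the loop started, so reassigning cleaned inside it is safe.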