Data Cleaning - Text Cleaner clarification

Time:05-07

def string_clean(s):
    
    cleaned_string = "".join([i for i in s if not i.isdigit()])
    return cleaned_string

This piece of code works perfectly fine. However, I would like to understand why the for loop inside the join call is wrapped in [] (square brackets). Why is the syntax like that, and is there a way to write it with the basic for-loop syntax that one usually comes across? E.g.:

   for i in s:
      if not i.isdigit():
         return "" + i

I know this example is sketchy at best, much like my coding prowess, but I would really appreciate feedback if you have the time.

Thanks in advance.

CodePudding user response:

Your example returns the first character that isn't a digit, because return exits the function on the first match. The for loop should look like this:

def string_clean(s):
    newString = ""
    for i in s:
        # str.isdigit() is all lowercase; keep only non-digit characters
        if not i.isdigit():
            newString += i
    return newString

CodePudding user response:

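The for loop sits inside square brackets because it is a list comprehension: an expression that builds a list, here of every character in s that is not a digit. str.join() takes an iterable of strings, and a list is one such iterable. Strictly speaking, the brackets are optional in this case, since join() would also accept the same expression as a generator.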
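To make the two steps visible, here is a minimal sketch that splits the one-liner apart. The sample string and the variable names (`s`, `chars`, `joined`) are made up for illustration and are not from the question.

```python
s = "abc123def"  # sample input, chosen only for illustration

# Step 1: the square brackets build an actual list of the kept characters.
chars = [ch for ch in s if not ch.isdigit()]
print(chars)   # ['a', 'b', 'c', 'd', 'e', 'f']

# Step 2: join() concatenates the list items, with "" as the separator.
joined = "".join(chars)
print(joined)  # abcdef

# join() accepts any iterable of strings, so the same expression
# without brackets (a generator expression) gives the same result.
joined_gen = "".join(ch for ch in s if not ch.isdigit())
print(joined_gen)  # abcdef
```

Read this way, the comprehension is just a compact spelling of "loop over s, test each character, collect the ones that pass".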

Yes, you can write it as a plain for loop, but list comprehensions are generally considered cleaner and more Pythonic, and Guido van Rossum, the creator of Python, has favored them. The plain for loop below produces the same result, though the comprehension is usually preferred for something this short:

def clean_string(s):
    cleaned_string = ''
    for i in s:
        # i is already a one-character string, so no str() call is needed
        if not i.isdigit():
            cleaned_string += i
    return cleaned_string
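For a loop that mirrors the comprehension step for step, one option is to collect the characters in a list and join them at the end. This is only a sketch; the function name `clean_string_via_list` and the sample input are hypothetical.

```python
def clean_string_via_list(s):
    # Hypothetical helper: the same logic as the comprehension, written out.
    kept = []                    # plays the role of the [...] list
    for ch in s:                 # "for i in s"
        if not ch.isdigit():     # "if not i.isdigit()"
            kept.append(ch)      # the leading "i" in the comprehension
    return "".join(kept)


print(clean_string_via_list("r2d2 and c3po"))  # rd and cpo
```

Each line of the loop corresponds to one part of the comprehension, which may make the bracket syntax easier to read back.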