Removing everything except words, digits and spaces using python regex

Time:11-13

Using Regex to remove everything except words, digits and spaces.

This is the function I defined:

import re

def remove(text):
   return re.sub(r'[^\w\d\s]', '', text)

Is there anything extra in it, or anything I have missed?

CodePudding user response:

Your approach will work. For example:

 import re

 text = ' !"(/£hello world1!!!!%"& '

 def remove(text):
   return re.sub(r'[^\w\d\s]', '', text)

 print(remove(text))

Your output will be the following. The leading and trailing spaces of the input are kept, because \s is in the allowed set:

  hello world1


CodePudding user response:

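\w already matches letters, digits (so the \d in your class is redundant) and the underscore _. In Python 3 it is also Unicode-aware by default for str patterns, so it matches more than just [A-Za-z0-9_], for example accented letters.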
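
As a quick check of what \w keeps, here is a minimal sketch. The sample string and variable names are made up for illustration, and it assumes Python 3, where str patterns are Unicode-aware unless re.ASCII is passed:

```python
import re

# Hypothetical sample: an underscore, an accented letter (e with acute), digits, punctuation.
sample = "snake_case caf\u00e9 42!"

# Default (Unicode) matching: the underscore and the accented letter are kept.
kept_unicode = re.sub(r'[^\w\s]', '', sample)

# With re.ASCII, \w is limited to [A-Za-z0-9_], so the accented letter is removed.
kept_ascii = re.sub(r'[^\w\s]', '', sample, flags=re.ASCII)

print(kept_unicode)
print(kept_ascii)
```

In both cases the underscore survives, which may or may not be what you want.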


So, if you do not want to keep underscores, try this code instead (with an explicit character class):

def remove(text):
   return re.sub(r'[^A-Za-z\d\s]', '', text)

Tell me if it's not working.
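
To compare the options side by side, here is a rough sketch. The function names and the sample string are hypothetical, chosen only for this illustration. It also shows why the character class should be followed directly by the closing quote: a space after ] becomes part of the pattern, so only unwanted characters that are followed by a space get removed (together with that space).

```python
import re

def remove_keep_underscore(text):
    # \w already covers digits, so \d is not needed inside the class.
    return re.sub(r'[^\w\s]', '', text)

def remove_ascii_only(text):
    # Keeps only ASCII letters, ASCII digits and whitespace; underscores are removed.
    return re.sub(r'[^A-Za-z0-9\s]', '', text)

def remove_with_stray_space(text):
    # The space after ']' is matched literally, so this removes
    # "unwanted character + space" pairs and nothing else.
    return re.sub(r'[^A-Za-z0-9\s] ', '', text)

sample = 'user_name: hello, world1!'

print(remove_keep_underscore(sample))   # user_name hello world1
print(remove_ascii_only(sample))        # username hello world1
print(remove_with_stray_space(sample))  # user_namehelloworld1!
```

Which variant is right depends on whether underscores and non-ASCII letters count as "words" for your data.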
