How to remove all punctuation from a string except in decimal numbers in Python?

Time:11-04

The title says it all, but Stack Overflow requires a description, so here it is: Input is the code I'm using, Output is what I currently get, and Required Output is what I want.

Input:

import regex as re  
keyword = 'Auto: tab suspender 2.0 pro'
keyword = re.sub(r'[^\w\s]','', keyword)
words = re.findall(r'\w+', keyword)
print(keyword)
print(len(words))
words

Output:

Auto tab suspender 20 pro
5
['Auto', 'tab', 'suspender', '20', 'pro']

Required Output:

Auto tab suspender 2.0 pro
5
['Auto', 'tab', 'suspender', '2.0', 'pro']

Answer:

I would use re.findall here:

import re

keyword = 'Auto: tab suspender 2.0 pro'
# Try a number with an optional decimal part first, then fall back to a word
matches = re.findall(r'\d+(?:\.\d+)?|\w+', keyword)
print(matches)  # ['Auto', 'tab', 'suspender', '2.0', 'pro']

The regex pattern first tries to match an integer or a decimal number and, failing that, falls back to matching a word:

  • \d+ matches an integer
  • (?:\.\d+)? optionally followed by a decimal part
  • | OR
  • \w+ matches a word
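The question also asks for the cleaned string and the word count, not just the list of tokens. Here is a minimal sketch of how the same pattern could produce all three. The helper names tokenize and strip_punctuation are made up for illustration, and joining on a single space is an assumption that collapses any original run of whitespace into one space.

```python
import re

# Pattern from the answer: a number with an optional decimal part, else a word.
TOKEN_RE = re.compile(r'\d+(?:\.\d+)?|\w+')

def tokenize(text):
    """Return numbers and words in the order they appear (hypothetical helper)."""
    return TOKEN_RE.findall(text)

def strip_punctuation(text):
    """Rebuild the string from its tokens, joined by single spaces (hypothetical helper)."""
    return ' '.join(tokenize(text))

keyword = 'Auto: tab suspender 2.0 pro'
words = tokenize(keyword)
print(strip_punctuation(keyword))  # Auto tab suspender 2.0 pro
print(len(words))                  # 5
print(words)                       # ['Auto', 'tab', 'suspender', '2.0', 'pro']
```

One limitation to keep in mind: a number is only kept whole when the token starts with a digit. For input like 'v2.0', \w+ consumes 'v2' first, so the result is ['v2', '0'].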