Python - How to Remove Words That Started With Number and Contain Period

Time:10-10

What is the best way to remove words in a string that start with numbers and contain periods in Python?

this_string = 'lorum3 ipsum 15.2.3.9.7 bar foo 1. v more text 46 2. here and even more text here v7.8.989'

If I use Regex:

re.sub('[0-9]*\.\w*', '', this_string)

The result will be:

'lorum3 ipsum  bar foo  v more text 46  here and even more text here v'

I expect the word v7.8.989 not to be removed, since it starts with a letter.

It would also be great if removing the words didn't leave extra spaces behind. My regex above still leaves double spaces.

CodePudding user response:

You can use this regex to match the strings you want to remove:

(?:^|\s)(?=[0-9]+\.[0-9.]*(?:\s|$))[0-9.]+

It matches:

  • (?:^|\s) : the beginning of the string or a whitespace character
  • (?=[0-9]+\.[0-9.]*(?:\s|$)) : a lookahead asserting that the word starts with one or more digits followed by a ., and that only digits and .s follow until the next whitespace or the end of the string
  • [0-9.]+ : one or more digits and periods (the word itself)
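
As a sketch, the same pattern can be written with re.VERBOSE so that each part listed above carries its own comment. The name NUMERIC_WORD and the sample string are made up for illustration.

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE.
# NUMERIC_WORD is an illustrative name, not part of the original answer.
NUMERIC_WORD = re.compile(r"""
    (?:^|\s)          # start of string, or the whitespace before the word
    (?=               # lookahead: the whole word must look like ...
        [0-9]+ \.     #   one or more digits followed by a period,
        [0-9.]*       #   then only digits and periods,
        (?:\s|$)      #   up to the next whitespace or the end of the string
    )
    [0-9.]+           # consume the word itself
""", re.VERBOSE)

sample = "foo 15.2.3 bar v7.8 46 1.2.3c 2."
cleaned = NUMERIC_WORD.sub("", sample)
print(cleaned)  # foo bar v7.8 46 1.2.3c
```

Here v7.8 survives because it starts with a letter, 46 survives because it has no period, and 1.2.3c survives because the lookahead requires only digits and periods up to the next whitespace.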

You can then replace any matches with the empty string. Because each match includes the whitespace before the word, no double spaces are left behind. In Python:

import re

this_string = 'lorum3 ipsum 15.2.3.9.7 bar foo 1. v more text 46 2. here and even more text here v7.8.989 and also 1.2.3c as well'
result = re.sub(r'(?:^|\s)(?=[0-9]+\.[0-9.]*(?:\s|$))[0-9.]+', '', this_string)

Output:

lorum3 ipsum bar foo v more text 46 here and even more text here v7.8.989 and also 1.2.3c as well
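
One edge case worth knowing: when the removed word is the very first thing in the string, there is no preceding whitespace to consume, so the space after it remains. A minimal sketch of one way to handle that, assuming it is acceptable to trim all leading whitespace (remove_numeric_words is a hypothetical helper name):

```python
import re

PATTERN = re.compile(r'(?:^|\s)(?=[0-9]+\.[0-9.]*(?:\s|$))[0-9.]+')

def remove_numeric_words(text):
    """Hypothetical helper: drop the matching words, then trim the left edge.

    Assumption: any leading whitespace in the input may be discarded too.
    """
    return PATTERN.sub('', text).lstrip()

raw = PATTERN.sub('', '1. foo 2.5 bar')
print(repr(raw))                                     # ' foo bar'
print(repr(remove_numeric_words('1. foo 2.5 bar')))  # 'foo bar'
```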

CodePudding user response:

If you don't want to use regex, you can also do it using simple string operations:

# Keep every word unless it starts with a digit and contains a period
res = ' '.join(e for e in this_string.split() if not (e.startswith(tuple('0123456789')) and '.' in e))
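
Note that this check removes any word that starts with a digit and contains a period, so it would also drop 1.2.3c, which the regex answer keeps. It also collapses runs of whitespace, since split() discards them. If the stricter behaviour is wanted without regex, a sketch along these lines might work (the function names are illustrative):

```python
DIGITS = '0123456789'

def is_numeric_word(word):
    """True if word starts with a digit, contains a period,
    and consists only of digits and periods."""
    return (
        word[0] in DIGITS
        and '.' in word
        and all(c in DIGITS + '.' for c in word)
    )

def remove_numeric_words_no_regex(text):
    # split() never yields empty strings, so word[0] is always safe here
    return ' '.join(w for w in text.split() if not is_numeric_word(w))

this_string = ('lorum3 ipsum 15.2.3.9.7 bar foo 1. v more text 46 2. here '
               'and even more text here v7.8.989 and also 1.2.3c as well')
print(remove_numeric_words_no_regex(this_string))
```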