How do I get rid of repeating special characters with regular expressions?

I want to remove every run of repeated dots but keep any dot that appears on its own.

Input strings:

(1) "a... b."
(2) "a....... b... c."

Results I want:

(1) "a b."
(2) "a b c."

Code:

import re

a = "a... b."
b = "a....... b... c."

result = re.sub("[^a-zA-Z0-9 \\.{1}]", "", a)
print(result)

result = re.sub("[^a-zA-Z0-9 \\.{1}]", "", b)
print(result)

result = re.sub("[^a-zA-Z0-9 ][\\.{2,}]", "", a)
print(result)

result = re.sub("[^a-zA-Z0-9 ][\\.{2,}]", "", b)
print(result)

None of these attempts work.

How can I get the results I want?

CodePudding user response：

The code below does what you need:

import re
result = re.sub("\\.{2,}","","a....b....c.d....e.")
print(result)

The result is:
abc.de.

CodePudding user response：

This will work:

import re

a = "a... b."
b = "a....... b... c."

# \.{2,} matches any run of two or more dots; a single dot is left alone.
result = re.sub(r"\.{2,}", "", a)
print(result)  # a b.

result = re.sub(r"\.{2,}", "", b)
print(result)  # a b c.
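
As for why the attempts in the question fail: inside square brackets, `{1}` and `{2,}` are not quantifiers. They are read as literal characters and simply added to the set. The sketch below shows what each attempt does to the second input, next to the version with the quantifier moved outside the brackets. The variable names are only for illustration.

```python
import re

text = "a....... b... c."

# Inside [...], "{1}" means the literal characters "{", "1" and "}".
# This negated class therefore excludes letters, digits, space, ".",
# "{", "1" and "}". Every character of `text` is in that set, so
# nothing is removed.
first_attempt = re.sub(r"[^a-zA-Z0-9 \.{1}]", "", text)

# Here the second class matches one of ".", "{", "2", "," or "}".
# The pattern removes dots two at a time, so runs with an odd number
# of dots keep one dot.
second_attempt = re.sub(r"[^a-zA-Z0-9 ][\.{2,}]", "", text)

# With the quantifier outside any brackets, \.{2,} matches a whole run
# of two or more dots.
fixed = re.sub(r"\.{2,}", "", text)

print(first_attempt)   # a....... b... c.
print(second_attempt)  # a. b. c.
print(fixed)           # a b c.
```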

CodePudding user response：

You can use

re.sub(r'\.{2,}|[^a-zA-Z0-9.\s]', '', text)


Details:

\.{2,} - two or more dots
| - or
[^a-zA-Z0-9.\s] - any character other than an ASCII letter, an ASCII digit, a dot, or whitespace
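
As a quick check of this pattern, here is a small sketch that applies it to the two strings from the question and to one extra input containing other punctuation. The helper name `clean` and the third sample string are assumptions made for illustration; they are not part of the question.

```python
import re

# Runs of two or more dots, or any single character that is not an
# ASCII letter, an ASCII digit, a dot, or whitespace.
PATTERN = re.compile(r"\.{2,}|[^a-zA-Z0-9.\s]")

def clean(text):
    """Remove repeated dots and other special characters from text."""
    return PATTERN.sub("", text)

print(clean("a... b."))           # a b.
print(clean("a....... b... c."))  # a b c.
print(clean("a!!... b?? c."))     # a b c.
```

A single dot survives because the first alternative needs at least two dots and the negated class explicitly lists the dot among the characters to keep.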