I want to remove spaces between numbers using Python. For example,
my_string = "take take , mouth ? 1 unit 1 2 3 1 mg 0 . 1 mg 1 . 1 mg 1 / 2 take . 5 unit and 00 . 5 unit"
My expected output is,
"take take , mouth ? 1 unit 1231 mg 0.1 mg 1.1 mg 1/2 take 0.5 unit and 0.5 unit"
NOTE: one zero was added before the decimal, and one was removed.
I have tried these,
Regex to remove spaces between numbers only
How to remove multiple spaces between numbers using a single re.sub
However, these don't work for single-digit numbers separated by spaces. Thank you in advance.
CodePudding user response:
You need to remove spaces not only between digits but also around . (dot) and / (slash). This can be accomplished as follows:
import re
my_string = "take take , mouth ? 1 unit 1 2 3 1 mg 0 . 1 mg 1 . 1 mg 1 / 2 take . 5 unit and 00 . 5 unit"
no_spaces = re.sub(r'(?<=[0-9./])\s+(?=[0-9./])', '', my_string)
print(no_spaces)
output
take take , mouth ? 1 unit 1231 mg 0.1 mg 1.1 mg 1/2 take .5 unit and 00.5 unit
\s denotes any whitespace (not only a space but also, for example, \t), and + means one or more of them. You might replace \s with a literal space character if you are sure you will only encounter spaces.
(?<=...) and (?=...) are positive lookbehind and positive lookahead assertions. These are zero-length assertions, used to make sure we only match whitespace that comes after a digit, dot, or slash and before a digit, dot, or slash, without matching those characters themselves.
A . used inside [ and ] is a literal dot and thus does not require escaping.
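To illustrate why the zero-length assertions matter for single digits, here is a small sketch comparing a pattern that consumes the digits with one that uses lookarounds. The consuming pattern is only an assumption made for illustration; it is not taken from the linked questions.

```python
import re

sample = "1 2 3 1"

# Capturing groups consume the digits on both sides of the gap, so a digit
# that ended one match cannot start the next one.
consuming = re.sub(r'(\d)\s+(\d)', r'\1\2', sample)

# Lookarounds only check the neighbouring digits; the match itself is just
# the whitespace, so every gap is found.
lookaround = re.sub(r'(?<=\d)\s+(?=\d)', '', sample)

print(consuming)   # 12 31
print(lookaround)  # 1231
```

The first call leaves a space behind because the 2 was already consumed by the first match, which may be why some patterns appear to fail on runs of single digits.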
To add a leading zero before a . followed by a digit, you can use re.sub again with zero-length assertions, as follows:
leading_zeros = re.sub(r'(?<=\s)(?=\.\d+)', '0', no_spaces)
print(leading_zeros)
output
take take , mouth ? 1 unit 1231 mg 0.1 mg 1.1 mg 1/2 take 0.5 unit and 00.5 unit
Explanation: put 0 between whitespace (\s) and a literal dot (\.) followed by one or more (+) digits (\d). Observe that . needs escaping when used outside [ and ].
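Note that the expected output in the question also turns 00.5 into 0.5, which the two substitutions above do not do. A minimal sketch of one way to cover that, assuming that any run of zeros at the start of a number and followed by another digit is redundant; the function name normalize_numbers and the third pattern are hypothetical additions, not part of the steps above:

```python
import re

def normalize_numbers(text):
    # Step 1: remove whitespace between digits, dots, and slashes.
    text = re.sub(r'(?<=[0-9./])\s+(?=[0-9./])', '', text)
    # Step 2: insert 0 between whitespace and a dot followed by digits.
    text = re.sub(r'(?<=\s)(?=\.\d+)', '0', text)
    # Step 3 (assumption): drop zeros that start a number and are followed
    # by another digit, e.g. 00.5 -> 0.5. The lookbehind skips zeros that
    # come after a digit or a dot, so 100 and 0.05 are left alone.
    text = re.sub(r'(?<![\d.])0+(?=\d)', '', text)
    return text

my_string = "take take , mouth ? 1 unit 1 2 3 1 mg 0 . 1 mg 1 . 1 mg 1 / 2 take . 5 unit and 00 . 5 unit"
print(normalize_numbers(my_string))
# take take , mouth ? 1 unit 1231 mg 0.1 mg 1.1 mg 1/2 take 0.5 unit and 0.5 unit
```

Step 3 would also change values such as 007 to 7, so it may need adjusting if leading zeros are meaningful in your data.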
For further discussion of the features used, read the re docs.