I have a regex that looks like this:
bool(re.match('^(0|[.]{0,1}[1-9][.]{0,1}[0-9]*)$', "0.123"))
This works well, but it doesn't match numbers that have a leading zero followed by a decimal point, like the example above. My goal is to match leading-zero numbers only when they contain a decimal point. Strings like "01" should not match. Otherwise, any decimal or whole number should match.
How would I write a regex that does this?
Some scenarios:
0.01 Match
1 Match
1.2 Match
.01 Match
-0.1 Match
-1 Match
1.2.3 No Match
01 No Match
001.1 No Match
00.1 No Match
CodePudding user response:
Here's a non-regex approach that appears to pass all the mentioned test cases:
numbers = """\
0.123
0
01
0.01
1
1.2
.01
-0.1
-1
1.2.3
-01.2
01
-01
001.1
00.1\
"""
def is_valid_num(s: str):
    unsigned_s = s.lstrip('- ')
    if unsigned_s.startswith('0'):
        try:
            if not unsigned_s[1] == '.':
                return False
        except IndexError:
            # It's just a zero (0)
            return True
    try:
        _ = float(s)
        return True
    except ValueError:
        return False

if __name__ == '__main__':
    for n in numbers.split('\n'):
        print(f'{n:<10} -> {is_valid_num(n)!r:>10}')
Output:
0.123      ->       True
0          ->       True
01         ->      False
0.01       ->       True
1          ->       True
1.2        ->       True
.01        ->       True
-0.1       ->       True
-1         ->       True
1.2.3      ->      False
-01.2      ->      False
01         ->      False
-01        ->      False
001.1      ->      False
00.1       ->      False
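One thing to watch for: float() accepts more than plain decimals (exponent notation such as 1e5, and the special values nan and inf), so is_valid_num returns True for those too. If that matters, here is a minimal sketch of a stricter check. It assumes only ASCII digits, one optional leading minus and one optional dot are allowed; is_plain_decimal is a hypothetical name, not part of the code above.

```python
def is_plain_decimal(s: str) -> bool:
    # Drop at most one leading minus sign.
    body = s[1:] if s.startswith('-') else s
    # Split on the first dot; a second dot stays in `frac`
    # and is rejected by the digit check below.
    whole, dot, frac = body.partition('.')
    # Only ASCII digits are allowed on either side of the dot.
    if any(c not in '0123456789' for c in whole + frac):
        return False
    # A leading zero is only allowed when the integer part is exactly "0".
    if len(whole) > 1 and whole.startswith('0'):
        return False
    # With a dot, digits must follow it ("1.2", ".01").
    # Rejecting a trailing dot such as "2." is an assumption.
    if dot:
        return frac != ''
    # Without a dot, the integer part must not be empty.
    return whole != ''


if __name__ == '__main__':
    for n in ['0.123', '01', '.01', '-0.1', '1.2.3', '1e5', 'nan', 'inf']:
        print(f'{n:<10} -> {is_plain_decimal(n)!r:>10}')
```

This keeps the same leading-zero rule as above but no longer relies on float() to decide what counts as a number.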
CodePudding user response:
In your example, you can use this regex:
^-?(((?!0)[0-9]+|0)(\.[0-9]+)?|\.[0-9]+)$
I'm not sure whether 2. (with a trailing dot) counts as a number.
If it does, you can use:
^-?(((?!0)[0-9]+|0)(\.([0-9]+)?)?|\.[0-9]+)$
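As a quick check, here is a minimal sketch that runs both patterns (written with a + quantifier after each digit class) against the scenarios from the question. It assumes Python's re module, as in the question; the names STRICT, LENIENT and matches are only illustrative.

```python
import re

# First pattern: a trailing dot such as "2." is not accepted.
STRICT = re.compile(r'^-?(((?!0)[0-9]+|0)(\.[0-9]+)?|\.[0-9]+)$')
# Second pattern: also accepts a trailing dot such as "2."
LENIENT = re.compile(r'^-?(((?!0)[0-9]+|0)(\.([0-9]+)?)?|\.[0-9]+)$')


def matches(pattern, text):
    """Return True if the compiled pattern matches the whole string."""
    return pattern.match(text) is not None


# Scenarios taken from the question, plus "0" and "0.123".
should_match = ['0.01', '1', '1.2', '.01', '-0.1', '-1', '0', '0.123']
should_not_match = ['1.2.3', '01', '001.1', '00.1']

if __name__ == '__main__':
    for text in should_match + should_not_match + ['2.']:
        print(f'{text:<6} strict={matches(STRICT, text)} '
              f'lenient={matches(LENIENT, text)}')
```

The (?!0) lookahead is what rejects leading zeros: a run of digits may only start the number if its first digit is not 0, otherwise the integer part has to be exactly 0.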
CodePudding user response:
Extend the tests with any other cases that apply, and use or adapt this simple regex:
In [1]: tests_map = {True: ['0.01', '1', '1.2', '.01', '01'],
   ...:              False: ['1.2.3']}
In [2]: pattern = r'^\d*\.{0,1}\d*$'
In [3]: import re
In [4]: for result, tests in tests_map.items():
   ...:     for test in tests:
   ...:         if not (bool(re.match(pattern, test)) == result):
   ...:             print(f"{test} was not {result}")
   ...:
You can 'read' the pattern as "zero or more decimal digits, followed by zero or one decimal point, followed by zero or more decimal digits".
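To see what that reading implies, here is a small sketch that probes the pattern with a few edge cases (the accepts helper and the sample strings are only illustrative). Every part of the pattern is optional, so it also accepts the empty string and a lone dot. It does not reject leading zeros such as 01, which the question wants excluded, and it has no provision for a minus sign.

```python
import re

pattern = r'^\d*\.{0,1}\d*$'


def accepts(text):
    """Return True if the pattern matches the whole string."""
    return re.match(pattern, text) is not None


if __name__ == '__main__':
    # '01', '' and '.' are accepted because every part is optional;
    # '-1' is rejected because the pattern does not allow a sign.
    for text in ['0.01', '1', '.01', '01', '', '.', '1.2.3', '-1']:
        print(f'{text!r:<8} {accepts(text)}')
```

Adapting it to the question would mean adding an optional leading minus and a rule against leading zeros.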