I can have one of these inputs:
sma15
sma15.5
sma-15
sma-15.5
foo-15.5usd (or Euro, or any other text/character attached to the number)
foo-15%
I need to output:
sma 15
sma 15.5
foo 15.5 usd
foo 15 %
Is this possible with a single, simple regex?
I need it to be fast.
CodePudding user response:
You can use re.split for this. The part in the capturing group (...) will be included in the result.
>>> import re
>>> tests = ['sma15', 'sma15.5', 'sma-15', 'sma-15.5', 'foo-15.5usd', 'foo-15%']
>>> [re.split(r"-?(\d+(?:\.\d+)?)", x) for x in tests]
[['sma', '15', ''],
['sma', '15.5', ''],
['sma', '15', ''],
['sma', '15.5', ''],
['foo', '15.5', 'usd'],
['foo', '15', '%']]
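To get the space-separated strings from the question, the pieces can be joined while dropping the empty trailing string. This is only a minimal sketch: the names `NUMBER` and `normalize` are made up for illustration, and the pattern is compiled once because the question asks for speed.

```python
import re

# Optional hyphen, then an integer or decimal number.
# The number is captured so that re.split keeps it in the result.
NUMBER = re.compile(r"-?(\d+(?:\.\d+)?)")

def normalize(text):
    # Split around the number and drop empty pieces such as the trailing ''.
    return " ".join(part for part in NUMBER.split(text) if part)

for s in ['sma15', 'sma15.5', 'sma-15', 'sma-15.5', 'foo-15.5usd', 'foo-15%']:
    print(normalize(s))
```

The hyphen sits outside the capturing group, so it is consumed as a separator and does not appear in the output.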
CodePudding user response:
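The following regular expression recognizes int and float values, with an optional sign and an optional exponent:
r"([+-]?(?:\d*\.\d*|\d+)(?:[Ee][+-]?\d+)?)"
It includes a capturing group around the whole pattern, so re.split keeps the matched number in the result.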
Test:
In [1]: import re
In [2]: numre = re.compile(r"([+-]?(?:\d*\.\d*|\d+)(?:[Ee][+-]?\d+)?)")
In [3]: re.split(numre, "sma15")
Out[3]: ['sma', '15', '']
In [4]: re.split(numre, "sma15.5")
Out[4]: ['sma', '15.5', '']
In [5]: re.split(numre, "sma-15")
Out[5]: ['sma', '-15', '']
In [6]: re.split(numre, "sma-15.5")
Out[6]: ['sma', '-15.5', '']
In [7]: re.split(numre, "foo-15.5usd")
Out[7]: ['foo', '-15.5', 'usd']
In [8]: re.split(numre, "foo-15.5%")
Out[8]: ['foo', '-15.5', '%']
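Note that this pattern treats the hyphen as a sign, so it stays attached to the number. If the hyphen is only a separator, as the expected output in the question suggests, the sign can be stripped before joining. A rough sketch under that assumption; the helper name `spaced` is hypothetical:

```python
import re

numre = re.compile(r"([+-]?(?:\d*\.\d*|\d+)(?:[Ee][+-]?\d+)?)")

def spaced(text):
    # Split on the first number only: [head, number, tail] when a match exists.
    parts = numre.split(text, maxsplit=1)
    if len(parts) != 3:
        return text  # no number found, return the input unchanged
    head, number, tail = parts
    # Assumption: a leading '+' or '-' is a separator, not a sign.
    number = number.lstrip("+-")
    return " ".join(p for p in (head, number, tail) if p)

print(spaced("sma-15.5"))     # sma 15.5
print(spaced("foo-15.5usd"))  # foo 15.5 usd
print(spaced("foo-15%"))      # foo 15 %
```

If negative numbers are meaningful in the data, skip the `lstrip` step and keep the sign.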