Split text and numbers

Time:02-25

I can have one of these inputs:

sma15
sma15.5
sma-15
sma-15.5

foo-15.5usd  (or maybe Euro, or any other text/character attached to a number)
foo-15%

I need to output:

sma 15
sma 15.5

foo 15.5 usd
foo 15 %

Is this possible with a simple single regex? I need it to be fast.

CodePudding user response:

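You can use re.split for this. The part in the capturing group (...) will be included in the result. Because the optional -? sits outside the group, the dash is consumed by the split and dropped from the output.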

>>> import re
>>> tests = ['sma15', 'sma15.5', 'sma-15', 'sma-15.5', 'foo-15.5usd', 'foo-15%']
>>> [re.split(r"-?(\d+(?:\.\d+)?)", x) for x in tests]
[['sma', '15', ''],
 ['sma', '15.5', ''],
 ['sma', '15', ''],
 ['sma', '15.5', ''],
 ['foo', '15.5', 'usd'],
 ['foo', '15', '%']]
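
To get the space-separated strings the question asks for, the split parts can be joined after dropping the empty strings. This is only a minimal sketch built on the pattern above; the names `NUMBER` and `split_text_and_number` are made up for illustration.

```python
import re

# Same pattern as above, compiled once so repeated calls stay cheap.
NUMBER = re.compile(r"-?(\d+(?:\.\d+)?)")

def split_text_and_number(s):
    """Return the text and number parts of s joined by single spaces."""
    parts = NUMBER.split(s)
    # re.split leaves an empty string when the number ends the input; skip it.
    return " ".join(p for p in parts if p)

for s in ['sma15', 'sma-15.5', 'foo-15.5usd', 'foo-15%']:
    print(split_text_and_number(s))
# sma 15
# sma 15.5
# foo 15.5 usd
# foo 15 %
```

Compiling the pattern once avoids the per-call cache lookup in the re module, which may matter if this runs in a tight loop.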

CodePudding user response:

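The following regular expression recognizes integers and floats, with an optional sign and an optional exponent:

r"([+-]?(?:\d*\.\d*|\d+)(?:[Ee][+-]?\d+)?)"

The whole pattern is wrapped in a capturing group, so re.split keeps the matched number in the result. Unlike the previous answer, the sign stays attached to the number.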
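
For readability, the same pattern can be written with re.VERBOSE so each piece carries a comment. This is just a sketch of an equivalent spelling; the name `numre_verbose` is made up for illustration.

```python
import re

numre_verbose = re.compile(r"""
    (                        # capture the whole number so re.split keeps it
      [+-]?                  # optional sign
      (?: \d*\.\d* | \d+ )   # digits around a decimal point, or plain digits
      (?: [Ee][+-]?\d+ )?    # optional exponent such as e-3
    )
""", re.VERBOSE)

print(numre_verbose.split("foo-15.5usd"))  # ['foo', '-15.5', 'usd']
print(numre_verbose.split("x1.5e-3y"))     # ['x', '1.5e-3', 'y']
```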

Test:

In [1]: import re

In [2]: numre = re.compile(r"([+-]?(?:\d*\.\d*|\d+)(?:[Ee][+-]?\d+)?)")

In [3]: re.split(numre, "sma15")
Out[3]: ['sma', '15', '']

In [4]: re.split(numre, "sma15.5")
Out[4]: ['sma', '15.5', '']

In [5]: re.split(numre, "sma-15")
Out[5]: ['sma', '-15', '']

In [6]: re.split(numre, "sma-15.5")
Out[6]: ['sma', '-15.5', '']

In [7]: re.split(numre, "foo-15.5usd")
Out[7]: ['foo', '-15.5', 'usd']

In [8]: re.split(numre, "foo-15.5%")
Out[8]: ['foo', '-15.5', '%']
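
One caveat worth checking against your data: the `\d*\.\d*` alternative also matches a lone dot, so text containing a "." with no digits around it gets split there too. Below is a sketch of a stricter variant that requires at least one digit. `STRICT` is my own assumption, not part of the answer above, and `ORIGINAL` is just the answer's pattern under a made-up name.

```python
import re

# The pattern from the answer above.
ORIGINAL = re.compile(r"([+-]?(?:\d*\.\d*|\d+)(?:[Ee][+-]?\d+)?)")

# Hypothetical stricter variant: digits with an optional fraction, or a
# fraction with a leading dot, so a bare "." is never treated as a number.
STRICT = re.compile(r"([+-]?(?:\d+\.?\d*|\.\d+)(?:[Ee][+-]?\d+)?)")

print(ORIGINAL.split("v.x"))     # ['v', '.', 'x']
print(STRICT.split("v.x"))       # ['v.x']
print(STRICT.split("sma-15.5"))  # ['sma', '-15.5', '']
```

If the inputs never contain a stray dot, the original pattern is fine as it is.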