Assuming I have the following string:
str = """
HELLO 1 Stop #$**& 5.02‼️ 16.1
regex
5 ,#2.3222
"""
I want to extract all numbers, whether int or float, that appear after the word "stop" (case-insensitive). The expected result is:
[5.02, 16.1, 5, 2.3222]
The furthest I have got so far is with the PyPI regex module, based on another post here:
regex.compile(r'(?<=stop.*)\d+(?:\.\d+)?', regex.I)
but this expression gives me only [5.02, 16.1].
CodePudding user response:
You get only the first 2 numbers because .* does not match a newline by default.
You can change the flags to regex.I | regex.S so that the dot also matches a newline.
import regex
text = """
HELLO 1 Stop #$**& 5.02‼️ 16.1
regex
5 ,#2.3222
"""
pattern = regex.compile(r'(?<=\bstop\b.*)\d+(?:\.\d+)?', regex.I | regex.S)
print(regex.findall(pattern, text))
Output
['5.02', '16.1', '5', '2.3222']
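To see the effect of the flag in isolation, here is a minimal sketch using the standard re module, where re.S behaves the same way for the dot. The short sample string and the variable names are made up for illustration.

```python
import re

sample = "Stop 5.02\n5"  # hypothetical two-line input

# Without re.S the dot stops at the first newline.
without_dotall = re.search(r"stop.*", sample, re.I).group()

# With re.S the dot also consumes newlines, so the match runs to the end.
with_dotall = re.search(r"stop.*", sample, re.I | re.S).group()

print(repr(without_dotall))  # 'Stop 5.02'
print(repr(with_dotall))     # 'Stop 5.02\n5'
```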
If you want the numbers after the word "stop", you can also use Python's built-in re module: match stop, then capture everything that follows in a group.
Then take the group 1 value and find all the numbers in it.
import re
text = """
HELLO 1 Stop #$**& 5.02‼️ 16.1
regex
5 ,#2.3222
"""
pattern = r"\bStop\b(.+)"
m = re.search(pattern, text, re.S|re.I)
if m:
    print(re.findall(r"\d+(?:\.\d+)*", m.group(1)))
Output
['5.02', '16.1', '5', '2.3222']
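Note that findall returns strings, while the question asks for numbers ([5.02, 16.1, 5, 2.3222]). A minimal sketch of the same two-step idea wrapped in a function that also converts the matches could look like this. The function name numbers_after and its word parameter are assumptions for illustration, not part of the original answer.

```python
import re

def numbers_after(text, word="stop"):
    """Return the numbers that follow the first occurrence of `word` (case-insensitive)."""
    # Step 1: capture everything after the word, across newlines.
    m = re.search(rf"\b{re.escape(word)}\b(.*)", text, re.S | re.I)
    if not m:
        return []
    # Step 2: find the number tokens in the captured tail.
    tokens = re.findall(r"\d+(?:\.\d+)?", m.group(1))
    # Tokens containing a dot become floats, the rest become ints.
    return [float(t) if "." in t else int(t) for t in tokens]

text = """
HELLO 1 Stop #$**& 5.02‼️ 16.1
regex
5 ,#2.3222
"""
result = numbers_after(text)
print(result)  # [5.02, 16.1, 5, 2.3222]
```

If the word does not occur at all, the function returns an empty list instead of raising.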
CodePudding user response:
Yet another one, albeit with the newer regex module:
(?:\G(?!\A)|Stop)\D+\K\d+(?:\.\d+)?
In Python, this could be
import regex as re
string = """
HELLO 1 Stop #$**& 5.02‼️ 16.1
regex
5 ,#2.3222
"""
pattern = re.compile(r'(?:\G(?!\A)|Stop)\D+\K\d+(?:\.\d+)?')
numbers = pattern.findall(string)
print(numbers)
And would yield
['5.02', '16.1', '5', '2.3222']
Don't name your variables after built-ins like str, list, dict and the like.
If you need to go further and limit your search to some bounds (e.g. all numbers between Stop and end), you could as well use
(?:\G(?!\A)|Stop)(?:(?!end)\D)+\K\d+(?:\.\d+)?
#                ^^^^^^^^^^^^^^
See another demo on regex101.com.
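For readers without the regex module, a rough standard-library approximation of the bounded search is to capture the text between Stop and the first end lazily, then run findall on that slice. This is a sketch of a swapped-in technique, not the \G pattern itself, and the sample string is hypothetical.

```python
import re

sample = "0 Stop 1, 2.5 end 7"  # hypothetical input

# Capture lazily from "Stop" up to the first "end" (or the end of the text).
m = re.search(r"Stop(.*?)(?:end|\Z)", sample, re.S)

# Only the slice between the two markers is scanned for numbers.
bounded = re.findall(r"\d+(?:\.\d+)?", m.group(1)) if m else []
print(bounded)  # ['1', '2.5']
```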
CodePudding user response:
You could use:
import re

inp = """
HELLO 1 Stop #$**& 5.02‼️ 16.1
regex
5 ,#2.3222"""
nums = []
if re.search(r'\bstop\b', inp, flags=re.I):
    inp = re.sub(r'^.*?\bstop\b', '', inp, flags=re.S|re.I)
    nums = re.findall(r'\d+(?:\.\d+)?', inp)
print(nums) # ['5.02', '16.1', '5', '2.3222']
The if logic above ensures that we only attempt to populate the list of numbers if we are certain that Stop appears in the input text. Otherwise, the default output is just an empty list. If Stop does appear, then we strip off that leading portion of the string before using re.findall to find all numbers appearing afterwards.
CodePudding user response:
import re
_string = """
HELLO 1 Stop #$**& 5.02‼️ 16.1
regex
5 ,#2.3222
"""
start = _string.find("Stop") + len("Stop")
print(re.findall(r"[-+]?\d*\.?\d+", _string[start:])) # ['5.02', '16.1', '5', '2.3222']
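One caveat with this approach: str.find is case-sensitive and returns -1 when the word is missing, in which case the slice silently starts at index 3 instead of failing. A minimal sketch that locates the word with re.search instead is shown below. The function name numbers_after_stop and the sample inputs are assumptions for illustration.

```python
import re

def numbers_after_stop(text):
    """Return number tokens (as strings) after the first case-insensitive 'stop'."""
    m = re.search(r"stop", text, re.I)
    if m is None:
        # No keyword found: return nothing rather than scanning a wrong slice.
        return []
    # Same number pattern as above, applied to the text after the keyword.
    return re.findall(r"[-+]?\d*\.?\d+", text[m.end():])

found = numbers_after_stop("HELLO 1 sToP -3 and .5")
missing = numbers_after_stop("HELLO 1")
print(found)    # ['-3', '.5']
print(missing)  # []
```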