I'm trying to find a regular expression that matches either a floating-point number or a string expression.
For example, the text to match might look like this:
ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range
My current version is:
fp_word = r"(?:[-+]?\d+.\d+|\w+\?)"
but it's not matching the ?Error or Range case.
It should match
3.101
5.0
?Error (including the question mark)
1.0
Range
CodePudding user response:
You can use
(?<= ).+
See this regex demo. It matches one or more chars other than line break chars, starting right after the first space and running to the end of the line.
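As a quick check, here is a minimal Python sketch of the lookbehind pattern, assuming the input is the sample text from the question held in one multi-line string (the variable names are only illustrative):

```python
import re

# Sample input copied from the question.
text = """ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range"""

# (?<= ) requires a space immediately before the match; .+ then consumes
# the rest of the line, because . does not match a line break by default.
after_space = re.compile(r"(?<= ).+")

values = after_space.findall(text)
print(values)  # ['3.101', '5.0', '?Error', '1.0', 'Range']
```

Note that if a value itself contained a space, the match would still run from the first space to the end of the line.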
If your regex should only match a number or some word optionally preceded with a ? char, and you want to keep your regex but only match at a (non)word boundary, you can use
(?:\b(?=\w)|\B(?=\W))(?!^)(?:[-+]?\d+(?:\.\d+)?|\??\w+)
See the regex demo. Here,
(?:\b(?=\w)|\B(?=\W)) - an adaptive dynamic word boundary of Type 2 (YouTube video explanation): it matches a word boundary if the next char is a word char, else, the position must be a non-word boundary position
(?!^) - not the start of string position
(?:[-+]?\d+(?:\.\d+)?|\??\w+) - either
[-+]?\d+(?:\.\d+)? - an optional - or + and then one or more digits followed with an optional sequence of a . and one or more digits
| - or
\??\w+ - an optional ? and one or more word chars.
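The breakdown above can be tried out with a short Python sketch. It assumes the whole sample is searched as one multi-line string, so re.MULTILINE is passed to make (?!^) reject the start of every line rather than only the start of the string; without that flag the labels DEF, HIJ, etc. would also be matched.

```python
import re

# Sample input copied from the question.
text = """ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range"""

boundary_rx = re.compile(
    r"(?:\b(?=\w)|\B(?=\W))"              # adaptive word boundary
    r"(?!^)"                              # not at the start of a line (with re.M)
    r"(?:[-+]?\d+(?:\.\d+)?|\??\w+)",     # a number, or a word with optional leading ?
    re.MULTILINE,
)

found = boundary_rx.findall(text)
print(found)  # ['3.101', '5.0', '?Error', '1.0', 'Range']
```

An alternative with the same effect would be to split the text into lines and call search on each line without the flag.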
CodePudding user response:
Your regex is this:
(?:[-+]?\d+.\d+|\w+\?)
It is not matching the non-numeric strings because its second alternative, \w+\?, matches 1+ word characters followed by a literal ?, i.e. the ? has to come after the string. In your input, one value starts with ? and the other one doesn't have a ? at all, so both fail to match.
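To see this failure concretely, here is a small Python sketch run against the sample text from the question (variable names are illustrative only):

```python
import re

# Sample input copied from the question.
text = """ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range"""

# The pattern from the question: only the numeric alternative ever matches here.
original = re.compile(r"(?:[-+]?\d+.\d+|\w+\?)")
matched = original.findall(text)
print(matched)  # ['3.101', '5.0', '1.0']

# The second alternative on its own: the ? must FOLLOW the word characters.
word_then_qmark = re.compile(r"\w+\?")
print(bool(word_then_qmark.fullmatch("Error?")))  # True
print(bool(word_then_qmark.fullmatch("?Error")))  # False
print(bool(word_then_qmark.fullmatch("Range")))   # False
```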
If I understand your requirements correctly, you can just use this regex:
[ ]([-+]?\d+.\d+|\S+)
It starts matching at a space and then matches either a signed floating point number or 1+ non-whitespace characters, i.e. \S+. The value itself is captured in group 1.
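A minimal Python sketch of this approach, again assuming the sample text from the question. Because the pattern has exactly one capture group, re.findall returns the group contents without the leading space:

```python
import re

# Sample input copied from the question.
text = """ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range"""

# A literal space, then group 1: a signed number or any run of non-whitespace.
space_then_value = re.compile(r"[ ]([-+]?\d+.\d+|\S+)")

captured = space_then_value.findall(text)
print(captured)  # ['3.101', '5.0', '?Error', '1.0', 'Range']
```

The . in the numeric alternative is unescaped, so it matches any character; writing \. instead would restrict that alternative to a real decimal point.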