Regular Expression for floating point or string

Time:03-02

I'm trying to find a regular expression that matches a floating point or a string expression.

I.e. a text to match might look like this:

ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range

My current version is:

fp_word = r"(?:[-+]?\d+.\d+|\w+\?)"

but it's not matching the ?Error or Range cases.

It should match

3.101
5.0
?Error (including the question mark)
1.0
Range

CodePudding user response:

You can use

(?<= ).+

See this regex demo. It matches one or more chars other than line break chars, starting right after the first space and running to the end of the line.
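As a rough check of that pattern, here is a minimal Python sketch. It assumes the pattern reads `(?<= ).+` (a lookbehind for a space, then one or more characters) and that the sample input from the question is held in a string called `text`, a name chosen only for this illustration.

```python
import re

# Sample input from the question; the variable name is illustrative only.
text = """ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range"""

# (?<= ) requires a space immediately before the match start;
# .+ then consumes the rest of the line (by default . does not match a newline).
values = re.findall(r"(?<= ).+", text)
print(values)  # ['3.101', '5.0', '?Error', '1.0', 'Range']
```

Because `.+` runs to the end of the line, a value that itself contains spaces would be returned as one match starting after the first space.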

If your regex should only match a number or a word optionally preceded by a ? char, and you want to keep your own regex but only match at a (non-)word boundary, you can use

(?:\b(?=\w)|\B(?=\W))(?!^)(?:[-+]?\d+(?:\.\d+)?|\??\w+)

See the regex demo. Here,

  • (?:\b(?=\w)|\B(?=\W)) - an adaptive dynamic word boundary of Type 2: it matches a word boundary if the next char is a word char; otherwise, the position must be a non-word boundary
  • (?!^) - not the start of string position
  • (?:[-+]?\d+(?:\.\d+)?|\??\w+) - either
    • [-+]?\d+(?:\.\d+)? - an optional + or - and then one or more digits followed by an optional sequence of a . and one or more digits
    • | - or
    • \??\w+ - an optional ? and one or more word chars.
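The breakdown above can be tried out with a short Python sketch. It assumes every quantifier in the pattern is `+` (one or more) and searches each sample line on its own, so that `(?!^)` only has to exclude the start of that line. The names `pattern`, `lines` and `matches` are made up for this example.

```python
import re

# Pattern from the answer, assuming each quantifier is "+" (one or more).
pattern = re.compile(r"(?:\b(?=\w)|\B(?=\W))(?!^)(?:[-+]?\d+(?:\.\d+)?|\??\w+)")

lines = ["ABC 3.101", "DEF 5.0", "HIJ ?Error", "KLM 1.0", "NOP Range"]

matches = []
for line in lines:
    # search() returns the first match; the leading label is skipped
    # because (?!^) forbids a match at the start of the string.
    m = pattern.search(line)
    if m:
        matches.append(m.group())

print(matches)  # ['3.101', '5.0', '?Error', '1.0', 'Range']
```

If the whole input is searched as one multi-line string instead, `^` only matches at the very start by default, so the `re.MULTILINE` flag would probably be needed to keep the labels on later lines from matching.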

CodePudding user response:

Your regex is this:

(?:[-+]?\d+.\d+|\w+\?)

It is not matching the non-numeric strings because \w+\? matches one or more word characters followed by a literal ?, i.e. a ? after the string. In your input, one value starts with ? and the other has no ? at all, so both fail to match.
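A small Python sketch can make that failure visible. It assumes the question's pattern is `(?:[-+]?\d+.\d+|\w+\?)`, and the extra input `Error?` is invented here just to show what the second alternative does accept.

```python
import re

# The question's pattern, assuming "+" quantifiers throughout.
original = re.compile(r"(?:[-+]?\d+.\d+|\w+\?)")

samples = ["KLM 1.0", "HIJ ?Error", "NOP Range", "Error?"]
results = {s: original.findall(s) for s in samples}

for sample, found in results.items():
    print(repr(sample), "->", found)
# 'KLM 1.0' -> ['1.0']
# 'HIJ ?Error' -> []
# 'NOP Range' -> []
# 'Error?' -> ['Error?']
```

Only the made-up `Error?` input satisfies `\w+\?`, because there the question mark comes after the word rather than before it.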

If I understand your requirements correctly you can just use this regex:

[ ]([-+]?\d+.\d+|\S+)

RegEx Demo

It starts matching at a space and then matches either a signed floating point number or one or more non-whitespace characters, i.e. \S+. The value you want is in capture group 1.
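For illustration, this is roughly how the pattern could be used from Python, assuming it reads `[ ]([-+]?\d+.\d+|\S+)`. The names `text`, `value_re` and `values` are chosen for this sketch and are not part of the answer.

```python
import re

# Sample input from the question; the variable name is illustrative only.
text = """ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range"""

# A literal space, then capture group 1: a signed float or a run of non-whitespace.
value_re = re.compile(r"[ ]([-+]?\d+.\d+|\S+)")

# With exactly one capture group, findall() returns the group contents,
# so the leading space is not part of the results.
values = value_re.findall(text)
print(values)  # ['3.101', '5.0', '?Error', '1.0', 'Range']
```

The `.` between the digit runs is unescaped, so it accepts any character. Writing `\.` would restrict the first alternative to a real decimal point, although `\S+` would still pick up anything that alternative rejects.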
