What is the best way to ensure a regex in OCaml matches the entire input string?

Time:04-07

In OCaml, I'm trying to check whether a regex matches the entire input string, not just a prefix, a suffix, or the portion of the input string before the first newline.

For example, I want to avoid a regex of [0-9]+ matching against strings like these:

let negative_matches = [
    "  123"; (* leading whitespace *)
    "123  "; (* trailing whitespace *)
    "123\n"; (* trailing newline *)
]

I see that Str.string_match still returns true when trailing characters do not match the pattern:

# List.map (fun s -> Str.string_match (Str.regexp "[0-9]+") s 0) negative_matches;;
- : bool list = [false; true; true]

Adding $ to the pattern helps in the second example, but $ is documented to only "match at the end of the line", so the third example still matches:

# List.map (fun s -> Str.string_match (Str.regexp "[0-9]+$") s 0) negative_matches;;
- : bool list = [false; false; true]

I don't see a true "end of string" matcher (like \z in Java and Ruby) documented, so the best answer I've found is to additionally check the length of the input string against the length of the match using Str.match_end:

# List.map (fun s -> Str.string_match (Str.regexp "[0-9]+") s 0 && Str.match_end () = String.length s) negative_matches;;
- : bool list = [false; false; false]
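
To avoid repeating the length check at every call site, it can be wrapped in a small helper. This is only a sketch: full_match is a hypothetical name, not something Str provides, and it assumes the str library is linked (for example with `#load "str.cma";;` in the toplevel).

```ocaml
(* Requires the str library (e.g. `ocamlfind ocamlopt -package str -linkpkg`). *)

(* full_match is a hypothetical helper, not part of Str.
   It succeeds only if the match starting at position 0
   also ends exactly at the end of the string. *)
let full_match (re : Str.regexp) (s : string) : bool =
  Str.string_match re s 0 && Str.match_end () = String.length s

let () =
  let digits = Str.regexp "[0-9]+" in
  List.iter
    (fun s -> Printf.printf "%S -> %b\n" s (full_match digits s))
    [ "123"; "  123"; "123  "; "123\n" ]
```

One caveat: the check only looks at the match Str actually reports, so for patterns with alternatives it may return false even when a different alternative could have covered the whole string.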

Please tell me I'm missing something obvious and there is an easier way.

CodePudding user response:

You are missing something obvious. There is an easier way. If

[^0-9]

is matched in the input string you will know it contains a non-digit character.
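
A minimal sketch of that idea with Str, assuming the str library is linked. The name all_digits is hypothetical, and the empty-string check is an addition here, since an empty input contains no non-digit character either:

```ocaml
(* Requires the str library (e.g. `#load "str.cma";;` in the toplevel). *)

(* Matches any single character that is not a decimal digit. *)
let non_digit = Str.regexp "[^0-9]"

(* all_digits is a hypothetical helper: true when s is non-empty
   and no non-digit character can be found anywhere in it. *)
let all_digits (s : string) : bool =
  s <> ""
  && (match Str.search_forward non_digit s 0 with
      | _ -> false                  (* found a non-digit *)
      | exception Not_found -> true (* nothing but digits *))
```

This only works for patterns that are a single repeated character class, so it answers the [0-9]+ example rather than the general question.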

CodePudding user response:

Try this for your example:

(?<![^A-z]|\w)[0-9]+(?![^A-z]|\w)

Test it here. If you want to build other patterns, you can start from these two forms:

(?<!'any group you don't want it to appear before your desire')

(?!'any group you don't want it to appear after your desire')
