Home > Back-end >  Regular Expression to find number before substring
Regular Expression to find number before substring

Time:06-01

I would like to write a regex in oracle to find the number before 'M(CAT)' or 'F(CAT)'. The number could be a whole number or a decimal.

Examples:

1.   5.4M(PIG), 8F(COW), 1F(DOG), 2.9M(CAT) -- answer should be 2.9
2.   2F(PIG), 7.4M(COW), 4.6F(DOG), 3F(CAT) -- answer should be 3
3.   1.5M(CAT) -- answer should be 1.5
4.   4F(PIG), 12F(CAT) -- answer should be 12
5.   7F(COW), 2.3M(DOG) -- answer should be null
6.   7.2F(COW) -- answer should be null

CodePudding user response:

Regexp_substr doesn't support backreferences, so probably the easiest way is to include M/F(CAT) in your substr, and then remove it afterwards, for example:

rtrim(regexp_substr(str, '([0-9] (\.[0-9])?)[MF]\(CAT\)'), 'MF(CAT)')

In this example, rtrim removes any of the characters ()ACFMT from the output.

As mathguy implied, if you have an input str of 8F(CAT), 3M(CAT), this will return 8 (the first match). If you want both of them, probably the best solution is to first split your comma-separated strings into separate rows and then run a regexp on them to extract the number. But you could also use the occurrence param to pull out any second occurrences more manually:

rtrim(regexp_substr(str, '([0-9] (\.[0-9])?)[MF]\(CAT\)', 1, 2), 'MF(CAT)')

A different option for this whole thing is to use backreferences with regexp_replace, but I think it's more awkward, since you have to include the rest of the string to remove it:

regexp_replace(str, '(^|.  )([0-9] (\.[0-9])?)[MF]\(CAT\).*', '\2')
  • Related