I would like to write a regex in oracle to find the number before 'M(CAT)' or 'F(CAT)'. The number could be a whole number or a decimal.
Examples:
1. 5.4M(PIG), 8F(COW), 1F(DOG), 2.9M(CAT) -- answer should be 2.9
2. 2F(PIG), 7.4M(COW), 4.6F(DOG), 3F(CAT) -- answer should be 3
3. 1.5M(CAT) -- answer should be 1.5
4. 4F(PIG), 12F(CAT) -- answer should be 12
5. 7F(COW), 2.3M(DOG) -- answer should be null
6. 7.2F(COW) -- answer should be null
CodePudding user response:
Regexp_substr doesn't support backreferences, so probably the easiest way is to include M/F(CAT) in your substr, and then remove it afterwards, for example:
rtrim(regexp_substr(str, '([0-9] (\.[0-9])?)[MF]\(CAT\)'), 'MF(CAT)')
In this example, rtrim
removes any of the characters ()ACFMT
from the output.
As mathguy implied, if you have an input str of 8F(CAT), 3M(CAT)
, this will return 8
(the first match). If you want both of them, probably the best solution is to first split your comma-separated strings into separate rows and then run a regexp on them to extract the number. But you could also use the occurrence
param to pull out any second occurrences more manually:
rtrim(regexp_substr(str, '([0-9] (\.[0-9])?)[MF]\(CAT\)', 1, 2), 'MF(CAT)')
A different option for this whole thing is to use backreferences with regexp_replace, but I think it's more awkward, since you have to include the rest of the string to remove it:
regexp_replace(str, '(^|. )([0-9] (\.[0-9])?)[MF]\(CAT\).*', '\2')