I'm back with another regular expression question. I've tried a few things and I can't seem to crack this unfortunate issue with some messaging data I have. I need to parse a particular value out of a SWIFT message. This works in 99% of my cases, but sometimes someone has entered the terminating character in a field I care about.
Imagine I have a text string like this:
some noise :50F: some noise 3/GB some noise :50A:
My expression looks for the 2 characters that come after 3/ in the field :50F: and is coded as follows:
50F:[^:]*?3\/([A-Z]{2})
I use [^:] because I only care about values inside the 50F field. For example, if I had a string like this:
some noise :50F: some noise some noise :50A: 3/GB
I wouldn't want to match GB.
This works really well, apart from the very rare occasions where the field content itself contains a : before the field ends (it seems there is no restriction on this). For example:
some noise :50F: some : noise 3/GB some noise :50A:
This obviously returns nothing, because it is only really searching "some" there.
The issue is that :50A: does not necessarily follow this field. It could be any one of a number of fields (and I am not even sure of the full list), but each field tag has the form :[0-9]{2,3}[A-Z]{0,1}:. Is there any way to make the search stop when it reaches something matching that pattern, instead of the single colon I am using currently?
I suspect the solution is some kind of negative lookahead, but I've not managed to get anything to work so far.
CodePudding user response:
You can use
50F:(?:(?!:[0-9]{2,3}[A-Z]?:).)*?3\/([A-Z]{2})
See the regex demo.
Details:
50F: - a literal string
(?:(?!:[0-9]{2,3}[A-Z]?:).)*? - any single char (other than line break chars), zero or more occurrences but as few as possible, that does not start the following pattern: a : char, two or three digits, an optional ASCII uppercase letter and a : char (for more details, see the tempered greedy token related post)
3\/ - a literal 3/ string
([A-Z]{2}) - Group 1: two uppercase letters.
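As a quick check, the pattern can be run against the three sample strings from the question. This is only a minimal sketch in Python; the question does not say which regex engine is in use, and the function name country_after_3_slash is made up for illustration.

```python
import re

# Tempered greedy token: consume one char at a time, but never a char that
# starts a field tag such as ":50A:" (colon, 2-3 digits, optional letter, colon).
# The "/" needs no escaping in Python, so "3\/" is written as "3/".
PATTERN = re.compile(r"50F:(?:(?!:[0-9]{2,3}[A-Z]?:).)*?3/([A-Z]{2})")


def country_after_3_slash(message):
    """Return the two letters after '3/' inside field 50F, or None.

    Hypothetical helper name, not part of the original answer.
    """
    match = PATTERN.search(message)
    return match.group(1) if match else None


samples = [
    "some noise :50F: some noise 3/GB some noise :50A:",    # normal case
    "some noise :50F: some noise some noise :50A: 3/GB",    # 3/GB is outside 50F
    "some noise :50F: some : noise 3/GB some noise :50A:",  # stray colon in 50F
]

for sample in samples:
    print(repr(sample), "->", country_after_3_slash(sample))
```

Note that . does not match line breaks by default. If the 50F field spans several lines in the real messages, a flag such as re.DOTALL (or the equivalent in the engine you use) would probably be needed.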