I need to extract the shift ID from a text message using regular expressions so that the payment is processed correctly. Our system receives 3 types of messages from customers:
1)Payment for shift # edc5df26-ad62-4685-ad80-4a3a60118479 receipt number #12345
2)Payment for shift # 394e3027-be5d-4369-91e6-88437c5330e0, adress: Germany, Frankfurt..
3)Payment for job shift # c921e015-74b2-4df2-84b2-e546a636272f
So the result should be:
1)'edc5df26-ad62-4685-ad80-4a3a60118479'
2)'394e3027-be5d-4369-91e6-88437c5330e0'
3)'c921e015-74b2-4df2-84b2-e546a636272f'
The ID can be followed by a space, a comma, or the end of the message.
So far I can only take all the characters after # using:
(?<=#).*
But I have no idea what to do next. What regular expression can solve this?
CodePudding user response:
Right after you matched the # symbol, you can start capturing your shift ID with this regex for example:
(?<=#)\s([a-z-\d]+)
- \s: to match the whitespace character
- (): to capture your id
- [a-z-\d]+: to match one or more lowercase letters, hyphens, or digits
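As a rough illustration, here is how that pattern might be applied in Python. The question does not say which language is in use, so Python, the function name extract_shift_id, and the `+` quantifier after the character class are assumptions made for this sketch.

```python
import re

# Assumed form of the pattern: lookbehind for '#', one whitespace character,
# then a captured run of lowercase letters, hyphens, and digits.
SHIFT_ID = re.compile(r"(?<=#)\s([a-z-\d]+)")


def extract_shift_id(message):
    """Return the captured ID after the first '# ', or None if absent."""
    match = SHIFT_ID.search(message)
    return match.group(1) if match else None


messages = [
    "Payment for shift # edc5df26-ad62-4685-ad80-4a3a60118479 receipt number #12345",
    "Payment for shift # 394e3027-be5d-4369-91e6-88437c5330e0, adress: Germany, Frankfurt..",
    "Payment for job shift # c921e015-74b2-4df2-84b2-e546a636272f",
]

for message in messages:
    print(extract_shift_id(message))
```

Because the pattern requires a whitespace character right after the #, the receipt number #12345 in the first message is not picked up. The capture stops at the first character outside the class, which covers the space, comma, and end-of-message cases.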
CodePudding user response:
You could assert "shift # " to the left, and then match the range of allowed characters, followed by a hyphen and the same range again, repeated 1 or more times.
(?<=\bshift # )[a-f0-9]+(?:-[a-f0-9]+)+
See a regex demo.
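A minimal sketch of the same idea in Python, assuming the `+` quantifiers shown here are the intended form of the pattern. The language choice and the function name find_shift_id are illustrative only and do not come from the question.

```python
import re

# Assumed form of the pattern: 'shift # ' must sit directly to the left,
# then hex groups joined by hyphens (at least one hyphen is required).
SHIFT_ID_STRICT = re.compile(r"(?<=\bshift # )[a-f0-9]+(?:-[a-f0-9]+)+")


def find_shift_id(message):
    """Return the ID that directly follows 'shift # ', or None if absent."""
    match = SHIFT_ID_STRICT.search(message)
    return match.group(0) if match else None


samples = [
    "Payment for shift # edc5df26-ad62-4685-ad80-4a3a60118479 receipt number #12345",
    "Payment for shift # 394e3027-be5d-4369-91e6-88437c5330e0, adress: Germany, Frankfurt..",
    "Payment for job shift # c921e015-74b2-4df2-84b2-e546a636272f",
]

for sample in samples:
    print(find_shift_id(sample))
```

Since the lookbehind is zero-width, the whole match is the ID itself and no capture group is needed. Restricting the class to a-f and 0-9 and requiring at least one hyphen makes this variant less likely to match unrelated text after a #.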