I have a string such as ID123456_SIT,UAT where ID###### will always be hardcoded.
I need a Python regex that will allow me to check whether ID123456_ and (SIT or UAT) exists before (without a comma) or after a comma in a particular string.
Scenarios:
1. ID123456_SIT,UAT - should match with regex
2. ID123456_UAT,SIT - should match with regex
3. ID123456_SIT - should match with regex
4. ID123456_UAT - should match with regex
5. ID123456_TRA,SIT,UAT - should match with regex
As of right now, the following only works if exactly one comma is specified (Scenarios 1 & 2). It does not work for single values with no comma (Scenarios 3 & 4). It also does not work if more than one comma is specified, at which point I should be checking whether the word exists between any of the commas (Scenario 5):
(^ID123456_)(SIT|UAT),(SIT|UAT)
- works for Scenarios 1 & 2 only
Also open to other suggestions for solving the same problem: checking if ID123456 & SIT/UAT is present in a pandas column's values.
Thanks in advance!
CodePudding user response:
You can use
^ID123456_(?=.*(?:SIT|UAT)).*
See the regex demo.
This matches
^ - start of string
ID123456_ - text that the string should start with
(?=.*(?:SIT|UAT)) - there must be either SIT or UAT after any zero or more chars other than line break chars, as many as possible
.* - the rest of the line.
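
As a quick check, the pattern above can be run against the five scenarios from the question. The sketch below is only illustrative: the names ANSWER_PATTERN, STRICT_PATTERN, samples and matches are made up for this example, and the two extra test strings (ID123456_TRA and ID123456_TRANSIT) are assumptions added to show the edge cases. Note that the lookahead only requires SIT or UAT to appear somewhere after the prefix, so it also accepts a value that merely contains those letters. If that matters, a stricter variant (an assumption, not part of the original answer) can require SIT or UAT to be a whole comma-separated item.

```python
import re

# Pattern from the answer: fixed prefix, then a lookahead that requires
# SIT or UAT to appear anywhere later in the string.
ANSWER_PATTERN = re.compile(r"^ID123456_(?=.*(?:SIT|UAT)).*")

# Hypothetical stricter variant: skip zero or more "item," chunks, then
# require SIT or UAT as a complete item (followed by a comma or the end).
STRICT_PATTERN = re.compile(r"^ID123456_(?:[^,]*,)*(?:SIT|UAT)(?:,|$)")

samples = [
    "ID123456_SIT,UAT",      # Scenario 1
    "ID123456_UAT,SIT",      # Scenario 2
    "ID123456_SIT",          # Scenario 3
    "ID123456_UAT",          # Scenario 4
    "ID123456_TRA,SIT,UAT",  # Scenario 5
    "ID123456_TRA",          # assumed negative case: no SIT or UAT at all
    "ID123456_TRANSIT",      # assumed edge case: "SIT" only as a substring
]


def matches(pattern, text):
    """Return True if the compiled pattern matches from the start of text."""
    return pattern.match(text) is not None


for s in samples:
    print(f"{s:22} answer={matches(ANSWER_PATTERN, s)} strict={matches(STRICT_PATTERN, s)}")
```

Both patterns accept Scenarios 1 to 5 and reject ID123456_TRA; they differ only on ID123456_TRANSIT, which the answer's pattern accepts and the stricter one rejects. For the pandas case mentioned in the question, either pattern string should be usable with Series.str.match or Series.str.contains, which return a boolean mask over the column.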