Regex to check if sequence of characters exists before or after delimiter

Time:10-14

I have a string such as ID123456_SIT,UAT, where the ID###### part will always be hardcoded.

I need a Python regex that checks whether a string starts with ID123456_ and contains SIT or UAT, either directly after the prefix (with no comma) or after a comma.

Scenarios:

  1. ID123456_SIT,UAT - should match with regex
  2. ID123456_UAT,SIT - should match with regex
  3. ID123456_SIT - should match with regex
  4. ID123456_UAT - should match with regex
  5. ID123456_TRA,SIT,UAT - should match with regex

Right now, the pattern below works only when exactly one comma is present (Scenarios 1 & 2). It does not work for single values with no comma (Scenarios 3 & 4). It also does not work when more than one comma is present, in which case I should be checking whether the word exists between any of the commas (Scenario 5):

  • (^ID123456_)(SIT|UAT),(SIT|UAT) - works for Scenarios 1 & 2 only
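
To make the behaviour described above reproducible, here is a minimal sketch that runs the attempted pattern against the five scenarios with Python's built-in re module. The names ATTEMPT, scenarios, and results are only illustrative.

```python
import re

# The attempted pattern from the question: prefix, one value, a comma, one value.
ATTEMPT = re.compile(r"(^ID123456_)(SIT|UAT),(SIT|UAT)")

scenarios = [
    "ID123456_SIT,UAT",      # Scenario 1
    "ID123456_UAT,SIT",      # Scenario 2
    "ID123456_SIT",          # Scenario 3
    "ID123456_UAT",          # Scenario 4
    "ID123456_TRA,SIT,UAT",  # Scenario 5
]

# re.search returns a match object or None; bool() turns that into True/False.
results = [bool(ATTEMPT.search(s)) for s in scenarios]

for s, matched in zip(scenarios, results):
    print(f"{s}: {matched}")
```

Only the first two scenarios match, because the pattern requires exactly one comma with SIT or UAT on both sides of it.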

Also open to other suggestions for solving the same problem: checking if ID123456 & SIT/UAT is present in a pandas column's values.

Thanks in advance!

CodePudding user response:

You can use

^ID123456_(?=.*(?:SIT|UAT)).*

See the regex demo.

This matches

  • ^ - start of string
  • ID123456_ - text that the string should start with
  • (?=.*(?:SIT|UAT)) - a positive lookahead that requires SIT or UAT to appear somewhere to the right, after zero or more chars other than line break chars (as many as possible)
  • .* - the rest of the line.
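
Since the question mentions a pandas column, here is a hedged sketch of how the pattern could be applied with Series.str.contains. The column name env and the extra rows (ID123456_SITE, ID123456_TRA, and a missing value) are assumptions added for illustration. TOKEN_PATTERN is not part of the answer above; it is an assumed stricter variant that only accepts SIT or UAT as a whole comma-separated value, since the lookahead pattern also accepts longer words that merely contain SIT or UAT.

```python
import pandas as pd

# Pattern from the answer: fixed prefix, then SIT or UAT anywhere to the right.
ANSWER_PATTERN = r"^ID123456_(?=.*(?:SIT|UAT)).*"

# Assumed stricter variant (not from the answer): skip any number of
# "value," chunks, then require SIT or UAT followed by a comma or end of string.
TOKEN_PATTERN = r"^ID123456_(?:[^,]*,)*(?:SIT|UAT)(?:,|$)"

# Hypothetical data: the five scenarios plus three rows that should not match.
df = pd.DataFrame({
    "env": [
        "ID123456_SIT,UAT",
        "ID123456_UAT,SIT",
        "ID123456_SIT",
        "ID123456_UAT",
        "ID123456_TRA,SIT,UAT",
        "ID123456_SITE",   # contains "SIT" only as part of a longer word
        "ID123456_TRA",    # neither SIT nor UAT
        None,              # missing value
    ]
})

# na=False makes missing values count as "no match" instead of NaN.
df["answer_match"] = df["env"].str.contains(ANSWER_PATTERN, regex=True, na=False)
df["token_match"] = df["env"].str.contains(TOKEN_PATTERN, regex=True, na=False)

print(df)
```

Both patterns accept all five scenarios. They differ only on ID123456_SITE, which the lookahead pattern accepts and the stricter variant rejects. Which behaviour is right depends on whether such values can occur in the data.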