How to make regex that matches all possible episode numbers from a tv show file format?

Time:03-26

I would like to create a regular expression that matches all possible episode numbering formats in TV show file names.

I currently have this regex, which matches most but not all of my examples:

(?:(?<=e)|(?<=episode)|(?<=episode[\.\s]))(\d{1,2})|((?<=-)\d{1,2})

The one case it does not match is two episodes written directly after one another: e0102 should match 01 and 02.
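To make the failing case concrete, here is a minimal Python sketch that runs the pattern above unchanged. The helper name `episode_numbers` is only illustrative, and Python's `re` module is an assumption (the question does not name a regex flavour).

```python
import re

# The pattern from the question, unchanged.
QUESTION_PATTERN = re.compile(
    r"(?:(?<=e)|(?<=episode)|(?<=episode[\.\s]))(\d{1,2})|((?<=-)\d{1,2})"
)

def episode_numbers(name):
    """Return every episode number the question's pattern finds in name."""
    # Each match fills exactly one of the two capture groups.
    return [m.group(1) or m.group(2) for m in QUESTION_PATTERN.finditer(name)]

print(episode_numbers("e01-02"))  # ['01', '02']
print(episode_numbers("e0102"))   # ['01'] (the trailing 02 is missed)
```

The second number is missed because, after 01 is consumed, the next digits are preceded by neither 'e' nor '-', so no lookbehind succeeds.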

You can find the regex example with test cases here

CodePudding user response:

As per your comment, I went by the following assumptions:

  • Episode numbers are never more than three digits long;
  • Episode strings will therefore have either 1-3 digits, or 4 or 6 digits when a range of episodes is meant;
  • There is never a run of 5 digits, assuming the same padding is used for both numbers in a range of episodes;
  • This means that a run of 4 or 6 digits needs to be split evenly.

Therefore, try the following:

e(?:pisode)?\s*(\d{1,3}(?!\d)|\d\d\d??)(?:-?e?(\d{1,3}))?(?!\d)

Here is an online demo. You'll notice I added some more samples to showcase the above assumptions.


  • e(?:pisode)?\s* - Match either 'e' or 'episode' followed by 0+ whitespace characters;
  • (\d{1,3}(?!\d)|\d\d\d??) - A 1st capture group to catch either 1-3 digits not followed by another digit, or two digits plus a lazily matched optional third;
  • (?:-?e?(\d{1,3}))? - An optional non-capture group with a nested 2nd capture group, matching an optional hyphen and an optional literal 'e' followed by 1-3 digits;
  • (?!\d) - Assert that no digit follows.
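As a rough way to try the pattern outside the online demo, here is a short Python sketch. The function name `parse_episodes`, the sample strings, and the `re.IGNORECASE` flag are assumptions added for illustration and are not part of the pattern itself.

```python
import re

# Pattern from the answer. re.IGNORECASE is an added assumption so that
# upper-case forms such as "E01" would match as well.
EPISODE_PATTERN = re.compile(
    r"e(?:pisode)?\s*(\d{1,3}(?!\d)|\d\d\d??)(?:-?e?(\d{1,3}))?(?!\d)",
    re.IGNORECASE,
)

def parse_episodes(name):
    """Return (first, second) episode strings, or None if nothing matches.

    second is None when the name holds a single episode number.
    """
    match = EPISODE_PATTERN.search(name)
    if match is None:
        return None
    return match.group(1), match.group(2)

for sample in ["e5", "episode 12", "e0102", "e001002", "e01-e02"]:
    print(sample, "->", parse_episodes(sample))
# e5 -> ('5', None)
# episode 12 -> ('12', None)
# e0102 -> ('01', '02')
# e001002 -> ('001', '002')
# e01-e02 -> ('01', '02')
```

The 4- and 6-digit cases split evenly because the first alternative fails whenever more than three digits follow, so the lazy second alternative takes over and leaves the remaining digits for the second group.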