C# Regular Expression - Does not seem to work consistently

Time:10-21

The Microsoft .NET C# regular expression:

@"^(t4_(?:[a-zA-Z]{5}[0-9])_(?:2?[0-3]{2}|1?[0-9])[jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec]{3}(?:[0-9]{4}))_([0-9]{4})_([0-9]{4})_([comp|disk|scs]){4}\\b";

The filenames that I wish to search for are consistently named as follows:

TraceLog_somename_other.log          -- should not match
t4_systx2_03oct2021_0001_2359_comp   -- Match
t4_systx2_03oct2021_0001_2359_disk   -- Match
t4_systx2_03oct2021_0001_2359_scs    -- should match, but does not

The goal is to scan a directory using a regular expression "mask" to find files with a specific filename signature. The signature is the same for all files. Matching each filename against a pattern seems to be one way to do this.

Why would this not work? If there is a better way to accomplish this, please share.

Thanks!

Tim P

CodePudding user response:

I assume you are asking why this does not match:

t4_systx2_03oct2021_0001_2359_scs

The problem is with this part of your regex:

([comp|disk|scs]){4}

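Square brackets define a character class, not a list of alternatives, so this is looking for exactly four characters, each taken from the set c, o, m, p, |, d, i, s, k. It will match the ending comp, but it will just as easily match cccc, ksid, scss, etc. It can never match scs, because scs has only three characters.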

Try this instead:

(comp|disk|scs)

As @41686d6564 points out in the comments, you have the same problem with the pattern you use to match months. Instead of [jan|feb|...]{3}, use (jan|feb|...).
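To make the difference concrete, here is a minimal sketch that runs both forms against a few suffixes. The ^ and $ anchors and the variable names are my own additions so that each suffix can be tested in isolation; they are not part of the original pattern. It assumes C# 9 or later, since it uses top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Character class repeated four times: each repetition matches ONE character
// from the set c, o, m, p, |, d, i, s, k.
var charClass = new Regex(@"^([comp|disk|scs]){4}$");

// Alternation: matches one of the three whole words.
var alternation = new Regex(@"^(comp|disk|scs)$");

foreach (var suffix in new[] { "comp", "disk", "scs", "cccc", "ksid" })
{
    Console.WriteLine(
        $"{suffix,-5} class={charClass.IsMatch(suffix),-5} alternation={alternation.IsMatch(suffix)}");
}
```

The character-class form accepts cccc and ksid and rejects scs, while the alternation accepts exactly comp, disk and scs.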

Full regex:

^(t4_(?:[a-zA-Z]{5}[0-9])_(?:2?[0-3]{2}|1?[0-9])(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)(?:[0-9]{4}))_([0-9]{4})_([0-9]{4})_(comp|disk|scs)\b
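Since the stated goal is to scan a directory, here is a rough sketch of how the corrected pattern might be applied to filenames. FindTraceFiles and tracePattern are hypothetical names, and scanning the current directory (".") is an assumption made only for illustration. Note that inside a C# verbatim string (@"...") the word boundary is written \b; writing \\b there would instead require a literal backslash followed by the letter b. The sketch assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Corrected pattern from the answer, written as a verbatim string
// so that \b reaches the regex engine as a word boundary.
var tracePattern = new Regex(
    @"^(t4_(?:[a-zA-Z]{5}[0-9])_(?:2?[0-3]{2}|1?[0-9])(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)(?:[0-9]{4}))_([0-9]{4})_([0-9]{4})_(comp|disk|scs)\b");

// Hypothetical helper: yields the names (not full paths) of the files in
// `directory` whose name matches the pattern.
IEnumerable<string> FindTraceFiles(string directory)
{
    foreach (var path in Directory.EnumerateFiles(directory))
    {
        var fileName = Path.GetFileName(path);
        if (tracePattern.IsMatch(fileName))
        {
            yield return fileName;
        }
    }
}

// Assumption: scan the current working directory.
foreach (var match in FindTraceFiles("."))
{
    Console.WriteLine(match);
}
```

Directory.EnumerateFiles also accepts a simple wildcard such as "t4_*" as a second argument, which could narrow the candidates before the regex runs; whether that is worthwhile depends on how many files the directory holds.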