Using a String[] in a Regex Pattern

Time:12-02

This is my first time dealing with regex. I need to use a string array as part of a regex pattern. Specifically, I'm trying to match a date in one of two formats, DDTTTTMMM and MMMDDTTTT, where the month is a three-letter abbreviation (e.g. DEC). I can't control where the month is placed in my input.

Date example for today is 011150DEC or DEC011150

String[] months = {"JAN", "FEB", …, "DEC"};

String pattern1 = [0-9][0-9][0-9][0-9][0-9][0-9][months]; 

String pattern2 = [months][0-9][0-9][0-9][0-9][0-9][0-9];

CodePudding user response:

For example, using simple string concatenation:

string[] months = {"JAN", "FEB", "DEC"};
string monthsGroup = "(?:" + String.Join("|", months) + ")";
string pattern1 = @"\d{6}" + monthsGroup;
string pattern2 = monthsGroup + @"\d{6}";

Console.WriteLine(pattern1);    // \d{6}(?:JAN|FEB|DEC)
Console.WriteLine(pattern2);    // (?:JAN|FEB|DEC)\d{6}

Of course, you could alternatively use String.Format or interpolated strings as you prefer.
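
To show the concatenated pattern in use, here is a minimal sketch that merges both orderings into one anchored pattern and tests a few inputs. The helper name LooksLikeDate, the shortened month list, and the Regex.Escape call are assumptions added for illustration; they are not part of the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Month abbreviations; shortened here for illustration only.
string[] months = { "JAN", "FEB", "DEC" };

// Escape each entry in case an array element ever contains regex metacharacters.
string monthsGroup = "(?:" + string.Join("|", months.Select(m => Regex.Escape(m))) + ")";

// Anchor with ^ and $ so the whole input has to match one of the two orderings.
string pattern = "^(?:" + @"\d{6}" + monthsGroup + "|" + monthsGroup + @"\d{6}" + ")$";

// Hypothetical helper: true when the input looks like DDTTTTMMM or MMMDDTTTT.
bool LooksLikeDate(string input) => Regex.IsMatch(input, pattern);

Console.WriteLine(LooksLikeDate("011150DEC")); // True
Console.WriteLine(LooksLikeDate("DEC011150")); // True
Console.WriteLine(LooksLikeDate("011150XYZ")); // False
```

Keep in mind that this only checks the shape of the input: six digits such as 999999 still pass, because nothing here validates the day or the time.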

CodePudding user response:

Regular expressions are not a good fit for parsing dates; just think of the leap-year problem. You can use DateTime.TryParseExact instead:

private static bool TryMyParse(string text, out DateTime result) => 
  DateTime.TryParseExact(
    text,
    new string[] { "ddHHmmMMM", "MMMddHHmm"},
    CultureInfo.InvariantCulture,
    DateTimeStyles.AssumeLocal,
    out result);

Demo:

string[] tests = new string[] {
  "011150DEC",
  "DEC011150",
  "abracadabra",
};

var report = string.Join(Environment.NewLine, tests
  .Select(test => $"{test,20} => {(TryMyParse(test, out var date) ? date.ToString("dd MMMM yyyy HH:mm") : "???")}"));

Console.Write(report);

Output:

           011150DEC => 01 December 2022 11:50
           DEC011150 => 01 December 2022 11:50
         abracadabra => ???
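
The practical benefit of TryParseExact is that impossible values should be rejected, not just malformed text. The sketch below tries a few such inputs; the helper name TryParseStamp and the sample inputs are made up for illustration, and since the formats carry no year, the parser is assumed to fall back to the current year.

```csharp
using System;
using System.Globalization;

// The same two layouts: day, hour, minute, month abbreviation, or month first.
string[] formats = { "ddHHmmMMM", "MMMddHHmm" };

// Hypothetical helper wrapping DateTime.TryParseExact for both layouts.
bool TryParseStamp(string text, out DateTime result) =>
    DateTime.TryParseExact(text, formats, CultureInfo.InvariantCulture,
                           DateTimeStyles.None, out result);

string[] inputs =
{
    "011150DEC", // a valid day and time
    "321150DEC", // day 32 does not exist
    "012650DEC", // hour 26 does not exist
    "APR310000", // April has only 30 days
};

foreach (string input in inputs)
{
    bool ok = TryParseStamp(input, out DateTime parsed);
    Console.WriteLine($"{input} => {(ok ? parsed.ToString("dd MMM HH:mm", CultureInfo.InvariantCulture) : "rejected")}");
}
```

Only the first input is expected to parse; a pure character-class regex would accept some of the others.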

If you insist on a regular expression, let's build the pattern step by step:

// Let's use correct abbreviations instead of hardcoded ones
string months = string.Join("|", CultureInfo
  .InvariantCulture
  .DateTimeFormat
  .AbbreviatedMonthNames
  .Where(m => !string.IsNullOrEmpty(m)))
     .ToUpper();

// Subpatterns for days, hours, minutes and months
string dd = "(?<dd>[0-3][0-9])"; // days 00 and 39 are still possible...
string HH = "(?<HH>[0-2][0-9])"; // hour 27 is still possible...
string mm = "(?<mm>[0-5][0-9])";
string MMM = $"(?<MMM>{months})";

// Time to construct the final pattern which should match 
// either ddHHmmMMM or MMMddHHmm:
string pattern = string.Join("|",
  $"(^{dd}{HH}{mm}{MMM}$)",
  $"(^{MMM}{dd}{HH}{mm}$)");

Have a look at the monster (which, however, still lets days be 00 or 39 and allows hours such as 26):

Console.Write(pattern);

Output:

(^(?<dd>[0-3][0-9])(?<HH>[0-2][0-9])(?<mm>[0-5][0-9])(?<MMM>JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)$)|(^(?<MMM>JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(?<dd>[0-3][0-9])(?<HH>[0-2][0-9])(?<mm>[0-5][0-9])$)
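
Because both alternatives reuse the same group names (which .NET allows), the parts can be read back by name whichever ordering matched. The following is a rough sketch of that; the helper name Describe and its output format are invented for illustration, and ToUpperInvariant is used in place of ToUpper so the result does not depend on the current culture.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Build the month alternation from the invariant culture's abbreviations.
string months = string.Join("|", CultureInfo.InvariantCulture.DateTimeFormat
    .AbbreviatedMonthNames
    .Where(m => !string.IsNullOrEmpty(m))).ToUpperInvariant();

string dd = "(?<dd>[0-3][0-9])";
string HH = "(?<HH>[0-2][0-9])";
string mm = "(?<mm>[0-5][0-9])";
string MMM = $"(?<MMM>{months})";

// Either ddHHmmMMM or MMMddHHmm; the group names repeat in both branches.
string pattern = string.Join("|",
    $"(^{dd}{HH}{mm}{MMM}$)",
    $"(^{MMM}{dd}{HH}{mm}$)");

// Hypothetical helper: report the named parts, or "no match".
string Describe(string input)
{
    Match match = Regex.Match(input, pattern);
    if (!match.Success) return input + ": no match";

    string day = match.Groups["dd"].Value;
    string hour = match.Groups["HH"].Value;
    string minute = match.Groups["mm"].Value;
    string month = match.Groups["MMM"].Value;
    return $"{input}: day={day}, hour={hour}, minute={minute}, month={month}";
}

Console.WriteLine(Describe("011150DEC"));   // 011150DEC: day=01, hour=11, minute=50, month=DEC
Console.WriteLine(Describe("DEC011150"));   // DEC011150: day=01, hour=11, minute=50, month=DEC
Console.WriteLine(Describe("392950DEC"));   // still matches: the regex cannot rule out day 39 or hour 29
Console.WriteLine(Describe("abracadabra")); // abracadabra: no match
```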