This is my first time dealing with regex. I need to use a string array as part of a regex pattern. Specifically, I'm trying to match a date, and the two formats I'm dealing with are DDTTTTMMM and MMMDDTTTT, where the month is a three-letter abbreviation (e.g. DEC). I can't control where the month is placed in my input.
Date example for today is 011150DEC or DEC011150.
String[] months = {"JAN", "FEB", …, "DEC"};
String pattern1 = [0-9][0-9][0-9][0-9][0-9][0-9][months];
String pattern2 = [months][0-9][0-9][0-9][0-9][0-9][0-9];
CodePudding user response:
For example, using simple string concatenation:
string[] months = {"JAN", "FEB", "DEC"};
string monthsGroup = "(?:" + String.Join("|", months) + ")";
string pattern1 = @"\d{6}" + monthsGroup;
string pattern2 = monthsGroup + @"\d{6}";
Console.WriteLine(pattern1); // \d{6}(?:JAN|FEB|DEC)
Console.WriteLine(pattern2); // (?:JAN|FEB|DEC)\d{6}
Of course, you could alternatively use String.Format or interpolated strings, as you prefer.
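As a minimal sketch of the interpolated-string variant, the two patterns can also be folded into one anchored expression and tested with Regex.IsMatch. The class name MonthPatternDemo and the helpers BuildPattern and IsDate are hypothetical names made up for this illustration; the ^ and $ anchors and the Regex.Escape call are additions that are not part of the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MonthPatternDemo
{
    // Hypothetical helper: builds one anchored pattern that accepts
    // either "six digits + month" or "month + six digits".
    public static string BuildPattern(string[] months)
    {
        // Regex.Escape changes nothing for plain letters; it only guards
        // against array entries that contain regex metacharacters.
        string monthsGroup =
            $"(?:{string.Join("|", months.Select(m => Regex.Escape(m)))})";

        // In an interpolated string, literal braces are written as {{ and }}.
        return $@"^(?:\d{{6}}{monthsGroup}|{monthsGroup}\d{{6}})$";
    }

    // Hypothetical helper: true when the whole input has one of the two layouts.
    public static bool IsDate(string input, string[] months) =>
        Regex.IsMatch(input, BuildPattern(months));
}
```

With months = {"JAN", "FEB", "DEC"}, MonthPatternDemo.IsDate("011150DEC", months) and MonthPatternDemo.IsDate("DEC011150", months) should both return true, while "011150XYZ" is rejected. Like the patterns above, this only checks the shape of the text, not whether the digits form a real day and time.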
CodePudding user response:
Regular expressions are not a good tool for parsing dates; just think of the leap year problem. You can use DateTime.TryParseExact instead:
private static bool TryMyParse(string text, out DateTime result) =>
DateTime.TryParseExact(
text,
new string[] { "ddHHmmMMM", "MMMddHHmm"},
CultureInfo.InvariantCulture,
DateTimeStyles.AssumeLocal,
out result);
Demo:
string[] tests = new string[] {
"011150DEC",
"DEC011150",
"abracadabra",
};
var report = string.Join(Environment.NewLine, tests
.Select(test => $"{test,20} => {(TryMyParse(test, out var date) ? date.ToString("dd MMMM yyyy HH:mm") : "???")}"));
Console.Write(report);
Output:
011150DEC => 01 December 2022 11:50
DEC011150 => 01 December 2022 11:50
abracadabra => ???
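To make the calendar point concrete, here is a small sketch showing that the same TryParseExact call also rejects inputs that have the right shape but no valid meaning. The wrapper class CalendarCheckDemo is a hypothetical name for this illustration. As far as I know, a format without a year makes .NET assume the current year, so whether 29 FEB is accepted depends on the year in which the code runs.

```csharp
using System;
using System.Globalization;

public static class CalendarCheckDemo
{
    // Same call as in the answer above, wrapped in a hypothetical demo class.
    public static bool TryMyParse(string text, out DateTime result) =>
        DateTime.TryParseExact(
            text,
            new[] { "ddHHmmMMM", "MMMddHHmm" },
            CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeLocal,
            out result);

    public static void Main()
    {
        // Well-shaped inputs that are not real dates or times.
        string[] suspicious = { "301150FEB", "012550DEC", "DEC011175" };

        foreach (string text in suspicious)
        {
            bool ok = TryMyParse(text, out DateTime date);
            Console.WriteLine($"{text} => {(ok ? date.ToString("dd MMMM HH:mm") : "rejected")}");
        }
    }
}
```

30 February, hour 25 and minute 75 do not exist, so all three inputs should be reported as rejected, whereas a pure `\d{6}` pattern would accept every one of them.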
If you insist on a regular expression, let's build the pattern step by step:
// Let's use correct abbreviations instead of hardcoded ones
string months = string.Join("|", CultureInfo
.InvariantCulture
.DateTimeFormat
.AbbreviatedMonthNames
.Where(m => !string.IsNullOrEmpty(m)))
.ToUpper();
// Subpatterns for days, hours, minutes and months
string dd = "(?<dd>[0-3][0-9])"; // days 00 and 39 are still possible...
string HH = "(?<HH>[0-2][0-9])"; // hour 27 is still possible...
string mm = "(?<mm>[0-5][0-9])";
string MMM = $"(?<MMM>{months})";
// Time to construct the final pattern which should match
// either ddHHmmMMM or MMMddHHmm:
string pattern = string.Join("|",
$"(^{dd}{HH}{mm}{MMM}$)",
$"(^{MMM}{dd}{HH}{mm}$)");
Have a look at the monster (which, however, still lets days be 00 or 39 and allows hours to be 26):
Console.Write(pattern);
Output:
(^(?<dd>[0-3][0-9])(?<HH>[0-2][0-9])(?<mm>[0-5][0-9])(?<MMM>JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)$)|(^(?<MMM>JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)(?<dd>[0-3][0-9])(?<HH>[0-2][0-9])(?<mm>[0-5][0-9])$)
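If the out-of-range days and hours are a concern, the subpatterns can be tightened and the named groups read back after a match. The following is only a sketch: the class NamedGroupDemo, the method TrySplit, and the stricter day and hour ranges are assumptions added for this illustration, not part of the pattern above. The month list is hardcoded here just to keep the example short. It relies on .NET allowing the same group name in both alternatives.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Tightened subpatterns (an assumption for this sketch):
    // days 01-31, hours 00-23, minutes 00-59.
    const string Dd  = "(?<dd>0[1-9]|[12][0-9]|3[01])";
    const string HH  = "(?<HH>[01][0-9]|2[0-3])";
    const string Mm  = "(?<mm>[0-5][0-9])";
    const string MMM = "(?<MMM>JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)";

    // Either ddHHmmMMM or MMMddHHmm, anchored at both ends.
    static readonly Regex Pattern =
        new Regex($"^(?:{Dd}{HH}{Mm}{MMM}|{MMM}{Dd}{HH}{Mm})$");

    // Hypothetical helper: returns the four parts as text when the input matches.
    public static bool TrySplit(string text, out string day, out string hour,
                                out string minute, out string month)
    {
        Match m = Pattern.Match(text);

        // Groups of a failed match have an empty Value, so this is safe either way.
        day    = m.Groups["dd"].Value;
        hour   = m.Groups["HH"].Value;
        minute = m.Groups["mm"].Value;
        month  = m.Groups["MMM"].Value;
        return m.Success;
    }
}
```

For "DEC011150" this should yield day "01", hour "11", minute "50" and month "DEC", while "391150DEC" and "012650DEC" no longer match. A date such as 31 FEB still gets through, which is why DateTime.TryParseExact remains the safer choice.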