I am supposed to:
- Check whether the file exists
- Read the contents of a premade text file and extract only the dates
- Count the number of matches
The text file is named: test.txt and contains the following info:
22 Jan,
Hello,
983shs247,
^*(26308,
27 December,
This is a test,
19 June.
The output should be:
Date Match,
Please enter file name: test.txt,
File exists!,
The date found is: 22 Jan,
The date found is: 27 December,
The date found is: 19 June,
The number of match is: 3.
My code only prints "Date Match", "Please enter the file name: test.txt", "File exists!" and "The number of match is :0". My regex does not seem to extract the dates. Please assist.
Console.WriteLine("Date Match");
Console.Write("Please enter the file name: ");
string filename = Console.ReadLine();
string fullname = @"C:\Work\" + filename;
int counter = 0;
if (File.Exists(fullname))
{
Console.WriteLine("File exists!");
StreamReader file3 = new StreamReader(fullname);
String inputdata;
Regex pattern1 = new Regex(@"\d{2}\s [Jan | Feb | Mar | Apr | May | June | Jul | Aug | Sep | Oct | Nov | December]");
while ((inputdata = file3.ReadLine()) != null)
{
foreach(Match m in pattern1.Matches(inputdata))
{
Console.WriteLine("The date found is :" + inputdata);
counter++;
}
}
Console.WriteLine("The number of match is :" + counter);
file3.Close();
}
Console.ReadLine();
CodePudding user response:
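Your regex is wrong: square brackets define a character class, which matches a single character from the set, not a list of alternative words. Use a group with alternation instead, and allow one or two digits for the day. Try:
Regex pattern1 = new Regex(@"\d{1,2}\s+(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Sep(?:t(?:ember)?)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)");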
See also: Regular Expression Language - Quick Reference (.NET documentation)
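To make the difference concrete, here is a minimal sketch of the alternation approach applied to the sample lines from the question. It is only an illustration: the lines are hard-coded rather than read from C:\Work\test.txt, the class name DateMatchSketch is made up, and the \b word boundaries are an addition of this sketch (they stop a trailing pair of digits in a longer number from being picked up as a day).

```csharp
using System;
using System.Text.RegularExpressions;

class DateMatchSketch
{
    static void Main()
    {
        // Sample lines copied from test.txt in the question (hard-coded for the sketch).
        string[] lines =
        {
            "22 Jan,", "Hello,", "983shs247,", "^*(26308,",
            "27 December,", "This is a test,", "19 June."
        };

        // Day (1 or 2 digits), whitespace, then one of the month names.
        // The \b anchors are an assumption added here, not part of the answer above.
        var pattern = new Regex(
            @"\b\d{1,2}\s+(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?" +
            @"|Aug(?:ust)?|Sep(?:t(?:ember)?)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\b");

        int counter = 0;
        foreach (string line in lines)
        {
            foreach (Match m in pattern.Matches(line))
            {
                // Print the matched text, not the whole input line.
                Console.WriteLine("The date found is: " + m.Value);
                counter++;
            }
        }

        Console.WriteLine("The number of match is: " + counter);
    }
}
```

With the sample lines this should report 22 Jan, 27 December and 19 June, followed by a count of 3. Printing m.Value instead of the input line keeps the trailing comma or full stop out of the output.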
CodePudding user response:
I suggest parsing with DateTime.TryParseExact instead of regular expression matching, e.g.
using System;
using System.Globalization;
using System.IO;
using System.Linq;
...
Console.WriteLine("Date Match");
Console.Write("Please enter the file name: ");
string filename = Console.ReadLine();
string fullname = Path.Combine(@"C:\Work", filename);
// Let's get rid of streams and query the file:
var counter = File
.ReadLines(fullname)
.Count(line => DateTime.TryParseExact(
line.Trim(' ', '.', ','),
new string[] { "d MMM", "d MMMM"},
CultureInfo.InvariantCulture,
DateTimeStyles.AssumeLocal,
out var date));
Console.WriteLine($"The number of match is {counter}");
If you can have several dates in a line, you can exploit regular expressions to get crude matches, which you then test by parsing:
var counter = File
.ReadLines(fullname)
.SelectMany(line => Regex
.Matches(line, @"\b[0-9]{1,2}\s+\p{L}{3,}\b")
.Cast<Match>()
.Select(match => match.Value))
.Count(line => DateTime.TryParseExact(
line.Trim(' ', '.', ','),
new string[] { "d MMM", "d MMMM"},
CultureInfo.InvariantCulture,
DateTimeStyles.AssumeLocal,
out var date));
Please note that:
- Parsing doesn't match invalid "dates" like 30 Feb
- It's very easy to add new formats, e.g. MMM d for APR 30
- If you want to localize the routine (e.g. let it speak Russian), all you have to do is specify the culture, CultureInfo.GetCultureInfo("ru-RU"), instead of developing a new regular expression pattern
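The three points above can be tried out with a small sketch like the following. The helper IsDate and the class name ParseSketch are hypothetical names introduced only for this illustration; the Russian sample string is generated at run time with ToString rather than typed in, because the exact month names depend on the culture data installed on the machine.

```csharp
using System;
using System.Globalization;

class ParseSketch
{
    // Hypothetical helper: true if the trimmed text parses with any of the given formats.
    static bool IsDate(string text, CultureInfo culture, params string[] formats) =>
        DateTime.TryParseExact(
            text.Trim(' ', '.', ','),
            formats,
            culture,
            DateTimeStyles.AssumeLocal,
            out _);

    static void Main()
    {
        CultureInfo invariant = CultureInfo.InvariantCulture;

        // A real date in one of the two formats from the answer.
        Console.WriteLine(IsDate("27 December,", invariant, "d MMM", "d MMMM")); // True

        // February never has 30 days, so parsing rejects it.
        Console.WriteLine(IsDate("30 Feb", invariant, "d MMM", "d MMMM"));       // False

        // Supporting month-first input only needs one more format string.
        Console.WriteLine(IsDate("APR 30", invariant, "d MMM", "d MMMM", "MMM d"));

        // Localizing: swap the culture and keep the same formats.
        CultureInfo russian = CultureInfo.GetCultureInfo("ru-RU");
        string sample = new DateTime(2024, 1, 22).ToString("d MMMM", russian);
        Console.WriteLine(sample + " -> " + IsDate(sample, russian, "d MMM", "d MMMM"));
    }
}
```

Keeping the culture and the format list as parameters means neither a new date layout nor a new language requires touching the matching logic itself.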