Question regarding Regex and StreamReader

Time:11-26

I am supposed to:

  1. Check whether the file exists
  2. Read the contents of a pre-made text file and extract only the dates
  3. Count the number of matches

The text file is named test.txt and contains the following:

22 Jan,
Hello,
983shs247,
^*(26308,
27 December,
This is a test,
19 June.

The output should be:

Date Match,
Please enter file name: test.txt,
File exists!,
The date found is: 22 Jan,
The date found is: 27 December,
The date found is: 19 June,
The number of match is: 3.

My code only prints "Date Match", "Please enter the file name: test.txt", "File exists!" and "The number of match is :0". My regex does not seem to extract the dates. Please assist.

Console.WriteLine("Date Match");
Console.Write("Please enter the file name: ");
string filename = Console.ReadLine();
string fullname = @"C:\Work\" + filename;
int counter = 0;
if (File.Exists(fullname))
{
    Console.WriteLine("File exists!");
    StreamReader file3 = new StreamReader(fullname);
    String inputdata;
    Regex pattern1 = new Regex(@"\d{2}\s [Jan | Feb | Mar | Apr | May | June | Jul | Aug | Sep | Oct | Nov | December]");
     
    while ((inputdata = file3.ReadLine()) != null)
    {
        foreach(Match m in pattern1.Matches(inputdata))
        {
            Console.WriteLine("The  date found is :" + inputdata);
            counter++;
        }
    }
    Console.WriteLine("The number of match is :" + counter);
    file3.Close();
}
Console.ReadLine();

CodePudding user response:

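Your regex is wrong: square brackets define a character class (a set of single characters), not a list of alternatives, so [Jan | Feb | ...] matches just one character. Use a group with | for alternation instead. Try:

Regex pattern1 = new Regex(@"\d{1,2}\s+(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Sept?(?:ember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)");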

See: Regular Expression Language - Quick Reference (Microsoft .NET documentation)
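
As a rough sketch of how that pattern could be applied to the sample lines from the question: the class and method names below (DateRegexSketch, FindDates) are made up for illustration and are not part of the original post. Note that it collects m.Value, the matched text, rather than the whole input line.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DateRegexSketch
{
    // Alternation lives inside a non-capturing group (?:...|...).
    // Square brackets would instead define a set of single characters.
    private static readonly Regex DatePattern = new Regex(
        @"\d{1,2}\s+(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Sept?(?:ember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)");

    // Returns every date-like substring found in the given lines.
    public static List<string> FindDates(IEnumerable<string> lines)
    {
        var found = new List<string>();
        foreach (string line in lines)
        {
            foreach (Match m in DatePattern.Matches(line))
            {
                found.Add(m.Value); // the matched text only, not the whole line
            }
        }
        return found;
    }
}
```

Calling DateRegexSketch.FindDates(File.ReadLines(fullname)) and printing each entry, followed by the list's Count, should give the output the question asks for. Keep in mind that this pattern checks only the shape of the text, so something like 30 Feb would still be counted.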

CodePudding user response:

I suggest parsing with DateTime.TryParseExact instead of matching with a regular expression, e.g.

  using System;
  using System.Globalization;
  using System.IO;
  using System.Linq;
  using System.Text.RegularExpressions;

  ...

  Console.WriteLine("Date Match");
  Console.Write("Please enter the file name: ");
  
  string filename = Console.ReadLine();
  string fullname = Path.Combine(@"C:\Work", filename);

  // Let's get rid of streams and query the file:
  var counter = File
    .ReadLines(fullname) 
    .Count(line => DateTime.TryParseExact(
       line.Trim(' ', '.', ','), 
       new string[] { "d MMM", "d MMMM"}, 
       CultureInfo.InvariantCulture, 
       DateTimeStyles.AssumeLocal, 
       out var date));

  Console.WriteLine($"The number of match is {counter}"); 
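
The query above only counts the dates. If each date should also be printed, as in the expected output from the question, a minimal sketch along the same lines might look like this. The names DateParseSketch, IsDate, Report and PrintReport are hypothetical, and the "File does not exist." message is my own assumption, since the question does not say what to print in that case.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

public static class DateParseSketch
{
    private static readonly string[] Formats = { "d MMM", "d MMMM" };

    // True when the trimmed line is exactly a day number plus a month name.
    public static bool IsDate(string line) =>
        DateTime.TryParseExact(
            line.Trim(' ', '.', ','),
            Formats,
            CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeLocal,
            out _);

    // Builds the report lines for a sequence of input lines.
    public static List<string> Report(IEnumerable<string> lines)
    {
        var dates = lines
            .Where(IsDate)
            .Select(line => line.Trim(' ', '.', ','))
            .ToList();

        var output = dates.Select(d => "The date found is: " + d).ToList();
        output.Add("The number of match is: " + dates.Count);
        return output;
    }

    // Hypothetical entry point: checks the file, then prints the report.
    public static void PrintReport(string fullname)
    {
        if (!File.Exists(fullname))
        {
            Console.WriteLine("File does not exist."); // assumed wording
            return;
        }

        Console.WriteLine("File exists!");
        foreach (string line in Report(File.ReadLines(fullname)))
        {
            Console.WriteLine(line);
        }
    }
}
```

Keeping Report separate from the file and console code means the matching logic can be checked against an in-memory list of lines without touching the disk.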

If a line can contain several dates, you can use a regular expression to find crude candidate matches, which you then validate by parsing:

  var counter = File
    .ReadLines(fullname)
    .SelectMany(line => Regex
       .Matches(line, @"\b[0-9]{1,2}\s+\p{L}{3,}\b")
       .Cast<Match>()
       .Select(match => match.Value)) 
    .Count(value => DateTime.TryParseExact(
       value.Trim(' ', '.', ','), 
       new string[] { "d MMM", "d MMMM"}, 
       CultureInfo.InvariantCulture, 
       DateTimeStyles.AssumeLocal, 
       out var date));

Please note that:

  • Parsing doesn't match invalid "dates" like 30 Feb
  • It's very easy to add new formats, e.g. MMM d for APR 30
  • To localize the routine (e.g. to make it understand Russian), all you have to do is specify the culture, e.g. CultureInfo.GetCultureInfo("ru-RU"), instead of developing a new regular expression pattern
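
To illustrate the points above about invalid dates, extra formats and cultures, here is a small sketch. The helper name TryParseDayMonth and its optional culture parameter are assumptions for illustration only; the formats array is the one from the answer with "MMM d" added.

```csharp
using System;
using System.Globalization;

public static class DateFormatSketch
{
    // The two original formats plus "MMM d" as an example of adding a new one.
    private static readonly string[] Formats = { "d MMM", "d MMMM", "MMM d" };

    // Tries to parse text as a day-and-month date in the given culture
    // (the invariant culture is used when none is supplied).
    public static bool TryParseDayMonth(string text, out DateTime date, CultureInfo culture = null)
    {
        return DateTime.TryParseExact(
            text.Trim(' ', '.', ','),
            Formats,
            culture ?? CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeLocal,
            out date);
    }
}
```

With this helper, "30 Feb" is rejected because no such day exists, while "Apr 30" is accepted through the added format. Passing CultureInfo.GetCultureInfo("ru-RU") as the third argument would switch the month names to that culture; which exact spellings are accepted depends on the culture data available on the machine.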