Using a Regex to find all instances of a specific string within a document, where a few words within

Time:05-17

So I know using a regex to accomplish this task is the way to go, but I am beyond abysmal when it comes to using them.

I have a log file that contains many instances of a line like this:

INFO Example.Services.Controllers.SubscribeController - Beginning Log Subscription for EmployeeID: jdoe RecordID: 111222

There is a lot of other content in the log file, such as stack traces and whatnot.

I want to extract every instance of the line above, but I am having trouble because the values for "EmployeeID" and "RecordID" change.

CodePudding user response:

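Assuming that your employee ID contains only lowercase letters and your record ID contains only digits, the regex to match this line would be (INFO Example\.Services\.Controllers\.SubscribeController - Beginning Log Subscription for EmployeeID: [a-z]+ RecordID: [0-9]+). The + after each character class means "one or more"; without it, [a-z] and [0-9] each match exactly one character, so an ID like jdoe would not match. Sample code that runs the regex against the sample line:

using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        // [a-z]+ and [0-9]+ match one or more characters, so the IDs can be any length.
        string pattern = @"(INFO Example\.Services\.Controllers\.SubscribeController - Beginning Log Subscription for EmployeeID: [a-z]+ RecordID: [0-9]+)";
        // Sample line from the question; replace this with the contents of your log.
        string input = @"INFO Example.Services.Controllers.SubscribeController - Beginning Log Subscription for EmployeeID: jdoe RecordID: 111222";

        // Regex.Matches returns every non-overlapping match found in the input.
        foreach (Match m in Regex.Matches(input, pattern))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}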

If your employee ID and record ID follow stricter rules, you can add them to the regex to fine-tune it. For example, if the record ID is always exactly six digits, the regex would be (INFO Example\.Services\.Controllers\.SubscribeController - Beginning Log Subscription for EmployeeID: [a-z]+ RecordID: [0-9]{6}). If the employee ID is limited to 20 characters, the regex would be (INFO Example\.Services\.Controllers\.SubscribeController - Beginning Log Subscription for EmployeeID: [a-z]{1,20} RecordID: [0-9]{6}), so the employee ID can be anywhere between 1 and 20 characters long. Use [a-zA-Z] if it can contain capital letters as well, and [a-zA-Z0-9] if it can contain both letters and digits. Note that [0-9]{6} on its own also matches the first six digits of a longer number; add \b after it if that matters.
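If you also need the two values themselves and not just the whole line, here is a rough sketch of one way to do it with named groups. The class name SubscriptionLogParser, the method Extract, the group names employee and record, and the sample log text (including the stack trace lines) are all made up for illustration; the ID rules (1 to 20 letters or digits, exactly six digits) are just the assumptions discussed above, so adjust them to your real data.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SubscriptionLogParser
{
    // Assumed rules: EmployeeID is 1-20 letters/digits, RecordID is exactly six digits.
    // (?<name>...) is a named group; \b stops the match from ending inside a longer number.
    private static readonly Regex LinePattern = new Regex(
        @"INFO Example\.Services\.Controllers\.SubscribeController - Beginning Log Subscription for EmployeeID: (?<employee>[a-zA-Z0-9]{1,20}) RecordID: (?<record>[0-9]{6})\b");

    // Returns one (Employee, Record) pair per matching line found in the log text.
    public static List<(string Employee, string Record)> Extract(string log)
    {
        var results = new List<(string Employee, string Record)>();
        foreach (Match m in LinePattern.Matches(log))
        {
            results.Add((m.Groups["employee"].Value, m.Groups["record"].Value));
        }
        return results;
    }

    public static void Main()
    {
        // Invented sample log: two matching lines with unrelated noise in between.
        string log =
            "INFO Example.Services.Controllers.SubscribeController - Beginning Log Subscription for EmployeeID: jdoe RecordID: 111222\n" +
            "ERROR System.NullReferenceException: Object reference not set to an instance of an object.\n" +
            "   at Example.Services.Foo.Bar()\n" +
            "INFO Example.Services.Controllers.SubscribeController - Beginning Log Subscription for EmployeeID: asmith42 RecordID: 333444\n";

        foreach (var (employee, record) in Extract(log))
        {
            Console.WriteLine("EmployeeID={0} RecordID={1}", employee, record);
        }
    }
}
```

With the sample text above this should print one line for jdoe / 111222 and one for asmith42 / 333444, skipping the stack trace. For a real log you would pass in the file contents (for example from File.ReadAllText) instead of the hard-coded string.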
