I have a text file of code from an old third-party system that I'm trying to upgrade. The code is structured text and looks very similar to VB. I'd like to parse the text file and display formatted text in a WPF application. Ideally, it would look similar to the Visual Studio code editor.
Below is a sample of the code I am trying to format:
--this is a comment
LOCAL tag1 --LOCAL would be formatted
LOCAL tag2
LOCAL foo
IF tag1 > tag2 THEN --IF and THEN would be formatted
foo = tag1
END IF --end if would be formatted
I've managed to do this by creating a FlowDocument from the original text of the code. I then search the document for keywords and change the color of the matching text with the following method:
private FlowDocument FormatDocument(FlowDocument flowDocument, List<string> keyWordList, Brush brush)
{
    TextPointer position = flowDocument.ContentStart;
    while (position != null)
    {
        if (position.CompareTo(flowDocument.ContentEnd) == 0)
            break;
        if (position.GetPointerContext(LogicalDirection.Forward) == TextPointerContext.Text) //checks to see if textpointer is actually text
        {
            foreach (string keyword in keyWordList)
            {
                string textRun = position.GetTextInRun(LogicalDirection.Forward);
                string pattern = @"\b" + Regex.Escape(keyword) + @"\b";
                Match match = Regex.Match(textRun, pattern, RegexOptions.IgnoreCase);
                if (match.Success)
                {
                    int indexInRun = match.Index;
                    int indexOfComment = textRun.IndexOf("--");
                    TextPointer startPosition = position.GetPositionAtOffset(indexInRun);
                    TextPointer endPosition = startPosition.GetPositionAtOffset(keyword.Length);
                    TextRange keywordRange = new TextRange(startPosition, endPosition);
                    string test = keywordRange.Text;
                    if (indexOfComment == -1 || indexInRun < indexOfComment)
                        keywordRange.ApplyPropertyValue(TextElement.ForegroundProperty, brush);
                }
            }
            position = position.GetNextContextPosition(LogicalDirection.Forward);
        }
        else //If the current position doesn't represent a text context position, go to the next context position.
            position = position.GetNextContextPosition(LogicalDirection.Forward); // This can effectively ignore the formatting or embed element symbols.
    }
    return flowDocument;
}
The code is a bit slow when the files are large, so I'm wondering whether there is a better way to go about this.
CodePudding user response:
Your code seems okay, except that you're creating a bunch of objects on every iteration of each loop, which will be slow, especially for Regex objects. Regexes are also usually faster at matching if you compile them (RegexOptions.Compiled), at the cost of some one-time construction overhead. Create your Regex objects once, outside both loops, and compile them, and I'll bet you see some improvement.
If that's not enough improvement, try building a single Regex that will match any word in the keyword list (\b(keyword1|keyword2|keyword3|...)\b).
public static FlowDocument FormatDocument(FlowDocument flowDocument,
                                          List<string> keyWordList,
                                          Brush brush)
{
    // Build one compiled, case-insensitive Regex per keyword, once, before any loop runs.
    var regexForKeyword = keyWordList.ToDictionary(k => k,
        k => new Regex(@"\b" + Regex.Escape(k) + @"\b",
                       RegexOptions.Compiled | RegexOptions.IgnoreCase));
    var position = flowDocument.ContentStart;
    while (position != null)
    {
        if (position.CompareTo(flowDocument.ContentEnd) == 0)
            break;
        if (position.GetPointerContext(LogicalDirection.Forward) == TextPointerContext.Text) //checks to see if textpointer is actually text
        {
            foreach (string keyword in keyWordList)
            {
                var textRun = position.GetTextInRun(LogicalDirection.Forward);
                var match = regexForKeyword[keyword].Match(textRun);
                if (match.Success)
                {
                    var indexInRun = match.Index;
                    var indexOfComment = textRun.IndexOf("--");
                    var startPosition = position.GetPositionAtOffset(indexInRun);
                    var endPosition = startPosition.GetPositionAtOffset(keyword.Length);
                    var keywordRange = new TextRange(startPosition, endPosition);
                    var test = keywordRange.Text;
                    if (indexOfComment == -1 || indexInRun < indexOfComment)
                        keywordRange.ApplyPropertyValue(TextElement.ForegroundProperty, brush);
                }
            }
            position = position.GetNextContextPosition(LogicalDirection.Forward);
        }
        else //If the current position doesn't represent a text context position, go to the next context position.
            position = position.GetNextContextPosition(LogicalDirection.Forward); // This can effectively ignore the formatting or embed element symbols.
    }
    return flowDocument;
}
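The single-Regex idea mentioned above could look roughly like the following. This is only a sketch of the matching step, kept free of WPF types so it can be tried in a console app; the names KeywordScanner, BuildKeywordRegex and FindKeywordSpans are made up for illustration, and it assumes each text run holds a single line of code with "--" starting a comment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
public static class KeywordScanner
{
    // Builds one compiled, case-insensitive pattern of the form \b(?:kw1|kw2|...)\b.
    // Longer keywords are listed first so that "END IF" is preferred over "END"
    // if both are in the list.
    public static Regex BuildKeywordRegex(IEnumerable<string> keywords)
    {
        var alternatives = keywords
            .OrderByDescending(k => k.Length)
            .Select(k => Regex.Escape(k));
        string pattern = @"\b(?:" + string.Join("|", alternatives) + @")\b";
        return new Regex(pattern, RegexOptions.Compiled | RegexOptions.IgnoreCase);
    }

    // Returns the (index, length) of every keyword that appears before the
    // first "--" in the text, so keywords inside a comment are skipped.
    public static List<(int Index, int Length)> FindKeywordSpans(Regex keywordRegex, string text)
    {
        int commentStart = text.IndexOf("--", StringComparison.Ordinal);
        string code = commentStart < 0 ? text : text.Substring(0, commentStart);

        var spans = new List<(int Index, int Length)>();
        foreach (Match m in keywordRegex.Matches(code))
            spans.Add((m.Index, m.Length));
        return spans;
    }
}
```

In FormatDocument you would build the Regex once before the while loop and call FindKeywordSpans once per text run instead of looping over the keyword list. Unlike a single Match call, Matches also picks up a keyword that occurs more than once in the same run. Because applying a property can split the run, it is probably safest to create all the TextRange objects for a run first (from position.GetPositionAtOffset(index) and the same call with index + length) and only then call ApplyPropertyValue on each.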