I have some code that extracts the content between two tags in a string:
string content = string.Empty;
foreach (Match match in Regex.Matches(stringSource, "<tag1>(.*?)</tag1>"))
{
    content = match.Groups[1].Value;
}
I need to do this many times with different tags, so I want to update the method to accept the opening and closing tags as parameters. However, when I build the regular expression from those parameters, the expression does not work:
public string GetContent(string stringSource, string openTag, string closeTag)
{
    string content = string.Empty;
    foreach (Match match in Regex.Matches(stringSource, $"{openTag}(.*?){closeTag}"))
    {
        content = match.Groups[1].Value;
    }
    return content;
}
I want to use the function like this:
string content = GetContent(sourceString, "<tag1>", "</tag1>");
How can I make this work?
CodePudding user response:
Try this:
public IEnumerable<string> GetContent(string stringSource, string tag)
{
    // Escape the tag name so any regex metacharacters in it are matched literally.
    string name = Regex.Escape(tag);
    foreach (Match match in Regex.Matches(stringSource, $"<{name}>(.*?)</{name}>"))
    {
        yield return match.Groups[1].Value;
    }
}
// ...
var content = GetContent(sourceString, "tag1");
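Note that I also changed the return type so the caller gets every match. What you had before kept only the last match, which is roughly equivalent to calling this function like this (LastOrDefault needs using System.Linq, and it returns null rather than string.Empty when nothing matches): string content = GetContent(sourceString, "tag1").LastOrDefault();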
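If the delimiters really do have to be passed in as full strings, as in the question, here is a minimal sketch of one way to do it. The class name TagContent is hypothetical. Adding RegexOptions.Singleline is my own assumption, made so that content spanning several lines is still matched. Regex.Escape is there so that delimiters containing regex metacharacters are treated as literal text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagContent
{
    // Hypothetical helper: yields the text between each openTag/closeTag pair.
    public static IEnumerable<string> GetContent(string source, string openTag, string closeTag)
    {
        // Regex.Escape makes any metacharacters in the delimiters match literally.
        string pattern = $"{Regex.Escape(openTag)}(.*?){Regex.Escape(closeTag)}";

        // Singleline lets '.' match newlines, so content may span several lines.
        foreach (Match match in Regex.Matches(source, pattern, RegexOptions.Singleline))
        {
            yield return match.Groups[1].Value;
        }
    }
}
```

To get the old single-string behaviour back, something like TagContent.GetContent(sourceString, "<tag1>", "</tag1>").LastOrDefault() ?? string.Empty should do.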
Also, regex is generally a poor choice for handling HTML and XML. There are all kinds of edge cases, such as attributes on the opening tag, nested elements with the same name, comments, and CDATA sections, that a pattern like this doesn't handle.
You can make it seem to work if you constrain your input to a small subset of the language, and that might get you by for a while. Eventually, though, someone will want to use more of the markup language's features, and you'll start getting weird bugs and errors. You'll do much better with a dedicated, purpose-built parser.
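As a rough illustration of the parser route, here is a sketch using XDocument from System.Xml.Linq. It assumes the input is well-formed XML with a single root element, which may not hold for your data, and the class name XmlTagContent is hypothetical. For real-world HTML you would more likely reach for a third-party HTML parser instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlTagContent
{
    // Hypothetical helper: returns the text content of every element named tagName.
    // Assumes source is well-formed XML; XDocument.Parse throws XmlException otherwise.
    public static IEnumerable<string> GetContent(string source, string tagName)
    {
        XDocument doc = XDocument.Parse(source);

        // Descendants finds matching elements at any depth, including nested ones.
        // A plain name only matches elements that are not in an XML namespace.
        return doc.Descendants(tagName).Select(e => e.Value);
    }
}
```

Unlike the regex version, this copes with attributes on the opening tag and with nested elements, at the cost of rejecting input that isn't valid XML.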