How can I get and replace texts between two strings in a string? Asp Core

Time:02-05

I have an HTML string. I need to change the string to add links to all header tags. For example,

First HTML

<h1> Title 1 </h1>
<p>Lorem ipsum dolor...</p>
<h2> Title 2 </h2>
<h2> Title 2 Different </h2>

The HTML I Want

    <div><a href="#my_slugged__link_by_title"> <h1> Title 1 </h1> </a> </div>
    <p>Lorem ipsum dolor...</p>
    <div><a href="#my_slugged__link_by_title_2"> <h2> Title 2 </h2> </a> </div>
    <div><a href="#my_slugged__link_by_title_2_different"> <h2> Title 2 Different </h2> </a> </div>

**my_slugged__link_by_title** --> I would like to generate these hash permalinks from the titles (h1, h2, ...).

For example, newDescription is my HTML string.

// Replace titles to add backlinks
newDescription = "<h2> TEST </h2> <h2> TEST 2</h2>";

This works for a single match:

var oneTitle = Regex.Match(newDescription, @"<h2> (.+?)</h2>").Groups[1].Value;

How can I replace and get all of them?

foreach (var item in Regex.Match(newDescription, @"<h2> (.+?)</h2>").Groups)
{
    string header_link = "<div class=\"blog_header__backlink_item\"><a href=\"#" + item + "\"><i class=\"fas fa-link\"></i></a></div>";
    newDescription = newDescription.Replace("<h2>", header_link + "<h2>");
}

CodePudding user response:

You should not try to parse or modify HTML with string methods. Even with regex, the task becomes too complicated once the HTML gets complex. Use HtmlAgilityPack:

using System;
using System.Linq;
using HtmlAgilityPack;

string html = @"<h1> Title 1 </h1>
<p>Lorem ipsum dolor...</p>
<h2> Title 2 </h2>
<h2> Title 2 Different </h2>";

string resultHtml = ReplaceHeaderHtml(html);

private static string ReplaceHeaderHtml(string html)
{
    var doc = new HtmlDocument();
    doc.LoadHtml(html);
    var xpath = "//*[self::h1 or self::h2 or self::h3 or self::h4]";
    HtmlNodeCollection headers = doc.DocumentNode.SelectNodes(xpath);
    if (headers == null || headers.Count == 0)
        return html;

    var headerList = headers
        .Where(node => !"a".Equals(node.PreviousSibling?.OriginalName, StringComparison.InvariantCultureIgnoreCase))
        .ToList();

    if (!headerList.Any())
        return html;

    for (int i = 0; i < headerList.Count; i++)
    {
        var header = headerList[i];
        var parentNode = header.ParentNode;
        int headerIndex = parentNode.ChildNodes.IndexOf(header);
        HtmlNode div = doc.CreateElement("div");
        HtmlNode anchor = doc.CreateElement("a");

        string href;
        switch(header.OriginalName)
        {
            case "h1": href = "#my_slugged__link_by_title"; break;
            case "h2": href = "#my_slugged__link_by_title_2"; break;
            case "h3": href = "#my_slugged__link_by_title_3"; break;
            default: href = "#my_slugged__link_by_title"; break;
        }

        anchor.Attributes.Add("class", "header_link");
        anchor.Attributes.Add("href", href);

        div.ChildNodes.Add(anchor);
        div.ChildNodes.Add(header);
        parentNode.ChildNodes.Remove(header);
        parentNode.ChildNodes.Insert(headerIndex, div);
    }

    return doc.DocumentNode.OuterHtml;
}
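
The switch above uses fixed placeholder hrefs, while the question asks for links derived from each title. A minimal sketch of one way to build such a slug follows. `HeaderSlug` and `Slugify` are hypothetical names introduced here for illustration; they are not part of HtmlAgilityPack or of the answer above. The rule used (lowercase, with every run of characters other than a-z and 0-9 collapsed into one underscore) is an assumption based on the example links in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of HtmlAgilityPack or the original answer).
public static class HeaderSlug
{
    // Turns a header title such as " Title 2 Different " into "title_2_different".
    public static string Slugify(string title)
    {
        if (string.IsNullOrWhiteSpace(title))
            return string.Empty;

        // Assumed rule: lowercase first, then replace each run of characters
        // outside a-z and 0-9 with a single underscore.
        string lowered = title.Trim().ToLowerInvariant();
        string slug = Regex.Replace(lowered, "[^a-z0-9]+", "_");

        // Drop underscores left at the ends by leading or trailing punctuation.
        return slug.Trim('_');
    }
}
```

Inside the loop, the switch could then be replaced with something like href = "#" + HeaderSlug.Slugify(header.InnerText). Note that this sketch drops non-ASCII letters and does not make slugs unique, so two headers with the same title would get the same link.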