How to change the siblings of xml into parent and child nodes according to their attributes using xs

Time:10-25

Hi, I have XML like this:

<?xml version="1.0" encoding="UTF-8"?>
<doc>
   <text height="15840" orient="" width="12240">
      <p style="pxpub">
         <r />
      </p>
      <p id-rel="0" level="0" style="TRH2" type="itemlist">
         <r>Chapter 2. Criminal Defense Malpractice</r>
      </p>
      <p style="TRRef">
         <r style="pxvis">Research References</r>
      </p>
      <p style="TRRefText">
         <r />
         <r bold="false" style="pxvis" />
         <r style="pxtk">West's Key Number Digest, Attorney and Client &amp;key;32 to 45</r>
      </p>
      <p style="TRRefText">
         <r />
         <r bold="false" style="pxvis" />
         <r style="pxtk">West's Key Number Digest, Constitutional Law &amp;key;3799, 3817</r>
      </p>
      <p style="TRRefText">
         <r />
         <r bold="false" style="pxvis" />
         <r style="pxrc">Hollander and Bergman, Everytrial Criminal Defense Resource Book &amp;s;</r>
         <r bold="false" style="pxvis">§</r>
         <r style="pxrc">69:1</r>
      </p>
      <p style="TRRefText">
         <r />
         <r bold="false" style="pxvis" />
         <r style="pxrc">Burkoff and Burkoff, Ineffective Assistance of Counsel &amp;ss;</r>
         <r bold="false" style="pxvis">§§</r>
         <r style="pxrc">1:1 to 1:10</r>
      </p>
      <p style="TRRefText">
         <r />
         <r bold="false" style="pxvis" />
         <r style="pxrc">Hall, Professional Responsibility in Criminal Defense Practice (3d ed.) &amp;ss;</r>
         <r bold="false" style="pxvis">§§</r>
         <r style="pxrc">1:18, 31:2 to 31:26</r>
      </p>
      <p id-rel="0" level="0" style="TRH7" type="itemlist">
         <r>2:1. Legal malpractice generally</r>
      </p>
      <p>
         <r />
         <r bold="false" style="pxvis" />
         <r>Attorneys owe their clients duties relating to professional knowledge.</r>
      </p>
      <p>
         <r />
         <r bold="false" style="pxvis" />
         <r>While, historically, regular reports of legal malpractice."</r>
         <foot-note id-rel="1" style="FootnoteReference" />
      </p>
      <p id-rel="0" level="0" margin-left="142" style="TRH7" type="itemlist">
         <r>2:2. Legal malpractice generally</r>
         <r style="pxsep">--</r>
         <r>Nature of cause of action</r>
         <foot-note id-rel="2" style="FootnoteReference" />
      </p>
      <p>
         <r />
         <r bold="false" style="pxvis" />
         <r>Legal malpractice actions are generally brought in tort and/or contract,.</r>
      </p>
      <p>
         <r />
         <r bold="false" style="pxvis" />
         <r>Under either a tort or a contract theory, the client.</r>
         <foot-note id-rel="3" style="FootnoteReference" />
      </p>
   </text>
</doc>

The output I need is:

<?xml version="1.0" encoding="UTF-8"?>
<chapter>
   <title>Chapter 2. Criminal Defense Malpractice</title>
   <research>
      <title>Research References</title>
      <reftext>West's Key Number Digest, Attorney and Client &amp;key;32 to 45
                        West's Key Number Digest, Constitutional Law &amp;key;3799, 3817
                        Hollander and Bergman, Everytrial Criminal Defense Resource Book &amp;s;§69:1
                        Burkoff and Burkoff, Ineffective Assistance of Counsel &amp;ss;§§1:1 to 1:10
                        Hall, Professional Responsibility in Criminal Defense Practice (3d ed.) &amp;ss;§§1:18, 31:2 to 31:26</reftext>
   </research>
   <sections>
      <title>2:1. Legal malpractice generally</title>
      <paragraphs>
         <para-text>Attorneys owe their clients duties relating to professional knowledge.</para-text>
         <para-text>
            While, historically, regular reports of legal malpractice."
            <footnoteref>
               <id>1</id>
            </footnoteref>
         </para-text>
      </paragraphs>
      <title>
         2:2. Legal malpractice generally-- Nature of cause of action
         <footnoteref>
            <id>2</id>
         </footnoteref>
      </title>
      <paragraphs>
         <para-text>Legal malpractice actions are generally brought in tort and/or contract,.</para-text>
         <para-text>
            Under either a tort or a contract theory, the client.
            <footnoteref>
               <id>3</id>
            </footnoteref>
         </para-text>
      </paragraphs>
   </sections>
</chapter>

The XSLT I wrote is:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">

  <xsl:output indent="yes" />

  <xsl:strip-space elements="*" />

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:copy-of select="@*" />
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="doc/text">
    <chapter>
      <xsl:apply-templates select="p[@style='TRH2']" />
      <research>
        <xsl:apply-templates select="p[@style='TRRef']" />
        <reftext>
          <xsl:for-each select="p[@style='TRRefText']">
            <xsl:apply-templates/>
            <xsl:text>
            </xsl:text>
          </xsl:for-each>
        </reftext>
      </research>
      <sections>
        <xsl:apply-templates select="p[@style='TRH7']" />
        <paragraphs>
          <xsl:for-each select="p[preceding-sibling::p[@style='TRH7']][normalize-space()]">
            <para-text>
              <xsl:apply-templates/>
            </para-text>
          </xsl:for-each>
        </paragraphs>
      </sections>
    </chapter>
  </xsl:template>

  <xsl:template match="text|r|doc">
    <xsl:apply-templates/>
  </xsl:template>


  <xsl:template match="foot-note">
    <footnoteref>
      <xsl:element name="id">
        <xsl:value-of select="@id-rel" />
      </xsl:element>
      <xsl:apply-templates/>

    </footnoteref>
  </xsl:template>

  <xsl:template match="p[not(normalize-space())]" />

  <xsl:template match="p[@style=('TRH2','TRH7','TRRef')]">
    <title>
      <xsl:apply-templates/>
    </title>
  </xsl:template>


</xsl:stylesheet>

The output I get with the above XSLT is:

<?xml version="1.0" encoding="UTF-8"?>
<chapter>
  <title>Chapter 2. Criminal Defense Malpractice</title>
  <research>
    <title>Research References</title>
    <reftext>West's Key Number Digest, Attorney and Client &amp;key;32 to 45 West's Key Number Digest, Constitutional Law &amp;key;3799, 3817 Hollander and Bergman, Everytrial Criminal Defense Resource Book &amp;s;§69:1 Burkoff and Burkoff, Ineffective Assistance
      of Counsel &amp;ss;§§1:1 to 1:10 Hall, Professional Responsibility in Criminal Defense Practice (3d ed.) &amp;ss;§§1:18, 31:2 to 31:26
    </reftext>
  </research>
  <sections>
    <title>2:1. Legal malpractice generally</title>
    <title>2:2. Legal malpractice generally-- Nature of cause of action
      <footnoteref>
        <id>2</id>
      </footnoteref>
    </title>
    <paragraphs>
      <para-text>Attorneys owe their clients duties relating to professional knowledge.</para-text>
      <para-text>While, historically, regular reports of legal malpractice."
        <footnoteref>
          <id>1</id>
        </footnoteref>
      </para-text>
      <para-text>2:2. Legal malpractice generally-- Nature of cause of action
        <footnoteref>
          <id>2</id>
        </footnoteref>
      </para-text>
      <para-text>Legal malpractice actions are generally brought in tort and/or contract,.</para-text>
      <para-text>Under either a tort or a contract theory, the client.
        <footnoteref>
          <id>3</id>
        </footnoteref>
      </para-text>
    </paragraphs>
  </sections>
</chapter>

I am trying to create a parent-child relationship between sibling elements based on their style attribute, but I have not been able to get it working. Can anyone tell me what changes I need to make to my XSLT to get the desired output?

CodePudding user response:

Here are two versions of an XSLT stylesheet that will process the XML file you posted: one for XSLT 2.0, which introduced a convenient xsl:for-each-group element with a group-starting-with=pattern attribute for this use case, and, for maximum portability, one for XSLT 1.0 that uses XPath to do the grouping. Both versions use doc/text as the logical root of the tree and rely on xsl:apply-templates to make the most of the built-in template rules. Mind the whitespace handling. The XSLT 2.0 version comes first.

More examples of flat-file transformation can be found on SO and in the XSLT 1.0 FAQ, now at archive.org.


<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
>
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="doc/text">
    <chapter>
      <title>
        <xsl:apply-templates select="p[@style='TRH2']"/>
      </title>
      <research>
        <title>
          <xsl:apply-templates select="p[@style='TRRef']"/>
        </title>
        <reftext>
          <xsl:apply-templates select="p[@style='TRRefText']"/>
        </reftext>
      </research>
      <sections>
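        <!-- Each TRH7 heading starts a new group; the unstyled p
             elements that follow it belong to that group -->
        <xsl:for-each-group
          select="p[not(@style) or @style='TRH7']"
          group-starting-with="p[@style='TRH7']"
        >
          <title>
            <!-- The context item is the first item of the group: the heading -->
            <xsl:apply-templates select="self::p[1]"/>
          </title>
          <paragraphs>
            <!-- Every group member after the heading is a body paragraph -->
            <xsl:for-each select="current-group()[self::p][position()>1]">
              <para-text>
                <xsl:apply-templates/>
              </para-text>
            </xsl:for-each>
          </paragraphs>
        </xsl:for-each-group>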
      </sections>
    </chapter>
  </xsl:template>


  <xsl:template match="p[@style='TRRefText']">
     <xsl:value-of select="."/><br/>
  </xsl:template>

  <xsl:template match="foot-note">
    <footnoteref>
      <id><xsl:value-of select="@id-rel"/></id>
      <xsl:apply-templates/>
    </footnoteref>
  </xsl:template>

</xsl:transform>

The XSLT 1.0 version (in the third xsl:template) uses an XPath expression to select the non-title p elements that lie between the current subsection title element (p[@style='TRH7']) and the next one, and a mode="para" attribute so that the title is not processed as both a title and a paragraph.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
>
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="doc/text">
    <chapter>
      <title>
        <xsl:apply-templates select="p[@style='TRH2']" />
      </title>
      <research>
        <title>
          <xsl:apply-templates select="p[@style='TRRef']" />
        </title>
        <reftext>
          <xsl:apply-templates select="p[@style='TRRefText']"/>
        </reftext>
      </research>
      <sections>
        <xsl:apply-templates select="p[@style='TRH7']" />
      </sections>
    </chapter>
  </xsl:template>


  <xsl:template match="p[@style='TRRefText']">
     <xsl:value-of select="."/><br/>    
  </xsl:template>

  <xsl:template match="p[@style='TRH7']">
    <title><xsl:apply-templates/></title>
    <paragraphs>
      <!-- Select the following non-heading p elements whose nearest
           preceding TRH7 heading is the current heading -->
      <xsl:apply-templates mode="para"
        select="following-sibling::p[not(@style='TRH7')]
               [generate-id(preceding-sibling::p[@style='TRH7'][1])
              = generate-id(current())]"
      />
    </paragraphs>
  </xsl:template>

  <xsl:template match="p" mode="para">
    <para-text><xsl:apply-templates/></para-text>
  </xsl:template>

  <xsl:template match="foot-note">
    <footnoteref>
      <id><xsl:value-of select="@id-rel"/></id>
      <xsl:apply-templates/>
    </footnoteref>
  </xsl:template>

</xsl:transform>
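
Both stylesheets above put a `<br/>` element after each reference, whereas the desired output in the question shows the references inside `reftext` separated by line breaks only. If plain newlines are what is wanted (an assumption on my part), the `TRRefText` template could be swapped for something like the sketch below. To keep it short, the `doc/text` template here is a stand-in that produces only the `reftext` element; it is not meant to replace the full template.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Stand-in for the full doc/text template: reference list only -->
  <xsl:template match="doc/text">
    <reftext>
      <xsl:apply-templates select="p[@style='TRRefText']"/>
    </reftext>
  </xsl:template>

  <!-- Emit the string value of each reference, followed by a newline
       after every entry except the last one -->
  <xsl:template match="p[@style='TRRefText']">
    <xsl:value-of select="."/>
    <xsl:if test="position() != last()">
      <xsl:text>&#10;</xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:transform>
```

The second template uses only XSLT 1.0 constructs, so it should drop into either version in place of the existing `TRRefText` template. position() and last() refer to the node list selected by the xsl:apply-templates call, so the newline is skipped only after the final reference. How the line breaks look in the final file may still depend on the serializer and on whatever tool displays the result.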