XSL is a language for expressing style sheets. Like a CSS file, an XSL style sheet describes how to display an XML document of a given type. I therefore want to use XSL to convert a complex XML file into a simplified one.
I am getting the XML file from ABBYY FineReader, and it is more complex than I need. All I need is to convert it into simplified XML. I have written an XSL file to transform src.xml into target.xml, but I am not getting the expected output.
If anyone has any idea what is wrong, please help.
Here is the complex XML file that I want to convert into simplified XML.
Source Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<document xmlns="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml" version="1.0" producer="ABBYY FineReader Engine 12" languages="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">
<page width="294" height="189" resolution="120" originalCoords="1">
<block blockType="Text" blockName="" l="0" t="5" r="272" b="185"><region><rect l="0" t="5" r="272" b="185"/></region>
<text>
<par lineSpacing="2410">
<line baseline="30" l="1" t="6" r="72" b="30"><formatting lang="EnglishUnitedStates">hello</formatting></line></par>
<par lineSpacing="1840">
<line baseline="87" l="0" t="69" r="179" b="87"><formatting lang="EnglishUnitedStates">this is a website</formatting></line></par>
<par lineSpacing="1260">
<line baseline="136" l="0" t="122" r="269" b="140"><formatting lang="EnglishUnitedStates">Is the writing getting smaller?</formatting></line></par>
<par lineSpacing="1260">
<line baseline="182" l="0" t="169" r="133" b="182"><formatting lang="EnglishUnitedStates">IM SHRINKING</formatting></line></par>
</text>
<text>
<par lineSpacing="2410">
<line baseline="30" l="1" t="6" r="72" b="30"><formatting lang="EnglishUnitedStates">10</formatting></line></par>
<par lineSpacing="1840">
<line baseline="87" l="0" t="69" r="179" b="87"><formatting lang="EnglishUnitedStates">20</formatting></line></par>
<par lineSpacing="1260">
<line baseline="136" l="0" t="122" r="269" b="140"><formatting lang="EnglishUnitedStates">30</formatting></line></par>
<par lineSpacing="1260">
<line baseline="182" l="0" t="169" r="133" b="182"><formatting lang="EnglishUnitedStates">40</formatting></line></par>
</text>
</block>
</page>
<page width="294" height="189" resolution="120" originalCoords="1">
<block blockType="Text" blockName="" l="0" t="5" r="272" b="185"><region><rect l="0" t="5" r="272" b="185"/></region>
<text>
<par lineSpacing="2410">
<line baseline="30" l="1" t="6" r="72" b="30"><formatting lang="EnglishUnitedStates">hii</formatting></line></par>
<par lineSpacing="1840">
<line baseline="87" l="0" t="69" r="179" b="87"><formatting lang="EnglishUnitedStates">Demo for XSL</formatting></line></par>
</text>
</block>
</page>
</document>
Desired output
Here is the simplified XML that I want:
<?xml version="1.0" encoding="UTF-8"?>
<document>
<page>
<block blockType="Text">
<text>
<paragraph>
<line>hello</line>
</paragraph>
<paragraph>
<line>this is a website</line>
</paragraph>
<paragraph>
<line>Is the writing getting smaller?</line>
</paragraph>
<paragraph>
<line>IM SHRINKING</line>
</paragraph>
</text>
<text>
<paragraph>
<line>10</line>
</paragraph>
<paragraph>
<line>20</line>
</paragraph>
<paragraph>
<line>30</line>
</paragraph>
<paragraph>
<line>40</line>
</paragraph>
</text>
</block>
</page>
<page>
<block blockType="Text">
<text>
<paragraph>
<line>hii</line>
</paragraph>
<paragraph>
<line>Demo for XSL</line>
</paragraph>
</text>
</block>
</page>
</document>
XSL Code
Here is the XSL I am using to convert the complex XML into simplified XML:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xpath-default-namespace="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<document>
<page>
<block>
<xsl:variable name="blockType" select="/document/page/block/@blockType"/>
<!-- The variable blockType can be used for further processing. -->
<xsl:attribute name="blockType"><xsl:value-of select="$blockType"/></xsl:attribute>
<xsl:for-each select="/document/page/block/text">
<text>
<xsl:for-each select="/document/page/block/text/par">
<paragraph>
<line>
<xsl:value-of select="./line"/>
</line>
</paragraph>
</xsl:for-each>
</text>
</xsl:for-each>
</block>
</page>
</document>
</xsl:template>
</xsl:stylesheet>
Actual output
<?xml version="1.0" encoding="UTF-8"?>
<document xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<page>
<block blockType="Text Text">
<text>
<paragraph>
<line>hello</line>
</paragraph>
<paragraph>
<line>this is a website</line>
</paragraph>
<paragraph>
<line>Is the writing getting smaller?</line>
</paragraph>
<paragraph>
<line>IM SHRINKING</line>
</paragraph>
<paragraph>
<line>10</line>
</paragraph>
<paragraph>
<line>20</line>
</paragraph>
<paragraph>
<line>30</line>
</paragraph>
<paragraph>
<line>40</line>
</paragraph>
<paragraph>
<line>hii</line>
</paragraph>
<paragraph>
<line>Demo for XSL</line>
</paragraph>
</text>
<text>
<paragraph>
<line>hello</line>
</paragraph>
<paragraph>
<line>this is a website</line>
</paragraph>
<paragraph>
<line>Is the writing getting smaller?</line>
</paragraph>
<paragraph>
<line>IM SHRINKING</line>
</paragraph>
<paragraph>
<line>10</line>
</paragraph>
<paragraph>
<line>20</line>
</paragraph>
<paragraph>
<line>30</line>
</paragraph>
<paragraph>
<line>40</line>
</paragraph>
<paragraph>
<line>hii</line>
</paragraph>
<paragraph>
<line>Demo for XSL</line>
</paragraph>
</text>
<text>
<paragraph>
<line>hello</line>
</paragraph>
<paragraph>
<line>this is a website</line>
</paragraph>
<paragraph>
<line>Is the writing getting smaller?</line>
</paragraph>
<paragraph>
<line>IM SHRINKING</line>
</paragraph>
<paragraph>
<line>10</line>
</paragraph>
<paragraph>
<line>20</line>
</paragraph>
<paragraph>
<line>30</line>
</paragraph>
<paragraph>
<line>40</line>
</paragraph>
<paragraph>
<line>hii</line>
</paragraph>
<paragraph>
<line>Demo for XSL</line>
</paragraph>
</text>
</block>
</page>
</document>
CodePudding user response:
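Your xsl:for-each instructions use absolute paths (starting with /), so each one selects every matching node in the whole document rather than the children of the node currently being processed. That is why every text element in your output contains all ten paragraphs, why blockType comes out as "Text Text", and why there is only one page. Use paths relative to the context node instead. Why don't you simply do: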
XSLT 2.0
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/document">
    <document>
      <xsl:for-each select="page">
        <page>
          <xsl:for-each select="block">
            <block blockType="{@blockType}">
              <xsl:for-each select="text">
                <text>
                  <xsl:for-each select="par">
                    <paragraph>
                      <line>
                        <xsl:value-of select="line"/>
                      </line>
                    </paragraph>
                  </xsl:for-each>
                </text>
              </xsl:for-each>
            </block>
          </xsl:for-each>
        </page>
      </xsl:for-each>
    </document>
  </xsl:template>
</xsl:stylesheet>
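XSLT 1.0
The xpath-default-namespace attribute is an XSLT 2.0 feature, so the stylesheet above will not match anything in a processor that only supports XSLT 1.0. If that is your situation (an assumption, since the question declares version 2.0), the same nesting can be sketched by binding the ABBYY namespace to a prefix. The prefix fr is an arbitrary name chosen for this example; only the namespace URI has to match the source document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch for XSLT 1.0 processors, which do not support xpath-default-namespace.
     The prefix "fr" is an arbitrary choice for this example. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fr="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml"
    exclude-result-prefixes="fr">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/fr:document">
    <document>
      <!-- Every select below is relative to the current node -->
      <xsl:for-each select="fr:page">
        <page>
          <xsl:for-each select="fr:block">
            <block blockType="{@blockType}">
              <xsl:for-each select="fr:text">
                <text>
                  <xsl:for-each select="fr:par">
                    <paragraph>
                      <line>
                        <xsl:value-of select="fr:line"/>
                      </line>
                    </paragraph>
                  </xsl:for-each>
                </text>
              </xsl:for-each>
            </block>
          </xsl:for-each>
        </page>
      </xsl:for-each>
    </document>
  </xsl:template>
</xsl:stylesheet>
```

The exclude-result-prefixes attribute keeps the fr namespace declaration out of the result. Attributes such as blockType are in no namespace, so they need no prefix.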
Or perhaps even simpler:
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
  <xsl:template match="block">
    <block blockType="{@blockType}">
      <xsl:apply-templates select="text"/>
    </block>
  </xsl:template>
  <xsl:template match="par">
    <paragraph>
      <line>
        <xsl:value-of select="line"/>
      </line>
    </paragraph>
  </xsl:template>
</xsl:stylesheet>
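One caveat with both versions: every par in your sample has exactly one line. If a paragraph can contain several line children (an assumption, since the sample does not show it), xsl:value-of select="line" in XSLT 2.0 writes all of them into a single line element, separated by spaces. A possible variation of the template-based stylesheet gives line its own template so that each source line produces its own output element:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">
  <xsl:output method="xml" indent="yes"/>
  <!-- Default rule: recreate the element without its namespace or attributes -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
  <!-- Keep only blockType and the text children; region/rect are skipped -->
  <xsl:template match="block">
    <block blockType="{@blockType}">
      <xsl:apply-templates select="text"/>
    </block>
  </xsl:template>
  <xsl:template match="par">
    <paragraph>
      <xsl:apply-templates select="line"/>
    </paragraph>
  </xsl:template>
  <!-- One output line per source line; the string value joins all formatting runs -->
  <xsl:template match="line">
    <line>
      <xsl:value-of select="."/>
    </line>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input in the question this should produce the same structure as the stylesheet above, because each paragraph holds a single line.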