XSLT output formatting with marc default namespace

Time:11-10

This is a minor cosmetic thing; I'm just curious what's going on under the hood. I'm converting MARC21 XML to another XML dialect using Saxonica transform.exe.

Here's a short sample of my output. The question is: how come OCLC_number has tags and content each on a separate line (with the closing tag backtabbed), whereas all the others have tags and contents all on one line? The latter looks better to me, but I know it's just a cosmetic thing.

    <?xml version="1.0" encoding="UTF-8"?>
    <AdlibXML xmlns:marc="http://www.loc.gov/MARC21/slim">
         <record>
                <OCLC_number>
                776125014
            </OCLC_number>
                <author>Lippmann, Harry.</author>
                <title>Deutsches Atlantik Wall Archiv : Register ... / Harry Lippmann.</title>
                <place_of_publication>Köln :</place_of_publication>
         </record>
    </AdlibXML>
    

Here's a sample input XML. In real life, it's a much larger export from WorldCat.

    <collection>
        <record xmlns="http://www.loc.gov/MARC21/slim">
            <datafield tag="034" ind1=" " ind2=" ">
                <subfield code="a">(OCoLC)776125014</subfield>
            </datafield>
            <datafield tag="100" ind1="1" ind2=" ">
                <subfield code="a">Lippmann, Harry.</subfield>
            </datafield>
            <datafield tag="245" ind1="1" ind2="0">
                <subfield code="a">Deutsches Atlantik Wall Archiv :</subfield>
                <subfield code="b">Register ... /</subfield>
                <subfield code="c">Harry Lippmann.</subfield>
            </datafield>  
            <datafield tag="260" ind1=" " ind2=" ">
                <subfield code="a">Köln :</subfield>
                <subfield code="b">Lippmann,</subfield>
                <subfield code="c">1996-....</subfield>
            </datafield>
        </record>
    </collection>
    

Here's a short version of my XSLT.

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" 
        xmlns:marc="http://www.loc.gov/MARC21/slim" 
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        >
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="collection">
    <AdlibXML>
        <xsl:apply-templates select="marc:record" />
    </AdlibXML>
    </xsl:template>

    <xsl:template match="marc:record">
        <!-- OCLC-number must not be empty -->
        <xsl:if test="marc:datafield[@tag=034] !=''" >
            <record>

                <OCLC_number>
                    <xsl:value-of select="translate(marc:datafield[@tag=034], '(OCoLC)', '')" />
                </OCLC_number>
                
                <author>
                    <xsl:value-of select="marc:datafield[@tag=100]/marc:subfield[@code='a']" />
                </author>

                <title>
                    <xsl:value-of select="marc:datafield[@tag=245]/marc:subfield[@code='a']" />
                    <xsl:if test="marc:datafield[@tag=245]/marc:subfield[@code='b'] != ''" >
                        <xsl:text> </xsl:text>
                        <xsl:value-of select="marc:datafield[@tag=245]/marc:subfield[@code='b']" /> 
                    </xsl:if>
                    <xsl:if test="marc:datafield[@tag=245]/marc:subfield[@code='c'] !=''" >
                        <xsl:text> </xsl:text>
                        <xsl:value-of select="marc:datafield[@tag=245]/marc:subfield[@code='c']" />
                    </xsl:if>
                </title>

                <place_of_publication>
                    <xsl:value-of select="marc:datafield[@tag=260]/marc:subfield[@code='a']" />
                </place_of_publication>

            </record>  
        </xsl:if>
    </xsl:template>
    </xsl:stylesheet>       

The XSLT works. I learned about default namespaces in the process of making it. In fact, I learned I had to use xmlns:marc="http://www.loc.gov/MARC21/slim". But while MARC21 itself is fully documented, I couldn't find any documentation about what this specific namespace is supposed to do or define.

CodePudding user response:

You need to either select the subfield, as in select="translate(marc:datafield[@tag=034]/marc:subfield, '(OCoLC)', '')", or use <xsl:strip-space elements="*"/>.
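
As a rough sketch of how those two suggestions might look in the stylesheet from the question, cut down to the OCLC_number field only: either fix should be enough on its own, and both are shown together purely for illustration. The `[@code='a']` predicate is an assumption based on the sample input. Note that translate() removes the individual characters `(`, `O`, `C`, `o`, `L` and `)` rather than the literal prefix, which is harmless here only because the remainder is all digits.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:marc="http://www.loc.gov/MARC21/slim"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Fix 2: drop whitespace-only text nodes from the input tree,
       so the datafield's string value no longer contains indentation -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="collection">
    <AdlibXML>
      <xsl:apply-templates select="marc:record"/>
    </AdlibXML>
  </xsl:template>

  <xsl:template match="marc:record">
    <!-- OCLC number must not be empty -->
    <xsl:if test="marc:datafield[@tag='034']/marc:subfield[@code='a'] != ''">
      <record>
        <!-- Fix 1: take the text of the subfield, not of the whole datafield -->
        <OCLC_number>
          <xsl:value-of select="translate(marc:datafield[@tag='034']/marc:subfield[@code='a'], '(OCoLC)', '')"/>
        </OCLC_number>
      </record>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

With either change, the content of OCLC_number should no longer begin and end with a line break, so the indenting serializer can keep the tags and the number on one line like the other fields.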

CodePudding user response:

You're copying the string value of the element

    <datafield tag="034" ind1=" " ind2=" ">
        <subfield code="a">(OCoLC)776125014</subfield>
    </datafield>

The string value of an element is the concatenation of all its descendant text nodes. Since you didn't strip whitespace from the input, that value is all the whitespace before the <subfield>, followed by "(OCoLC)776125014", followed by all the whitespace after the <subfield>. The second run of whitespace is a bit shorter than the first, hence the jaggedness in the output. The serializer (with indent="yes") has a certain amount of freedom to adjust whitespace in the output, but not where that whitespace is part of the actual content of an element explicitly written to the result tree.
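
One way to see this for yourself is a small diagnostic stylesheet, sketched here on the assumption that it is run against the sample input from the question. The output element names (`check`, `datafield_length`, `subfield_length`, `trimmed`) are made up for illustration. It compares the length of the datafield's string value with that of the subfield alone, and shows normalize-space() as another way to discard the surrounding whitespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:marc="http://www.loc.gov/MARC21/slim"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <check>
      <xsl:for-each select="collection/marc:record/marc:datafield[@tag='034']">
        <!-- String value of the datafield: includes the line breaks and
             indentation around <subfield> when whitespace is not stripped -->
        <datafield_length>
          <xsl:value-of select="string-length(.)"/>
        </datafield_length>
        <!-- String value of the subfield alone: only its own text -->
        <subfield_length>
          <xsl:value-of select="string-length(marc:subfield[@code='a'])"/>
        </subfield_length>
        <!-- normalize-space() trims leading and trailing whitespace
             and collapses internal runs to a single space -->
        <trimmed>
          <xsl:value-of select="normalize-space(.)"/>
        </trimmed>
      </xsl:for-each>
    </check>
  </xsl:template>
</xsl:stylesheet>
```

If the first length comes out larger than the second, the difference is the whitespace that ends up inside OCLC_number in the original output.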
