Home > OS >  Merging *.fods using xsltproc to generate worbook with multiple sheets
Merging *.fods using xsltproc to generate worbook with multiple sheets

Time:12-20

I have several excel worksheets in unix environment. My goal is to merge them into one single workbook in unix environment itself.

I found a found solution using xsltproc which sort of works, but doesnt do the job completely. (merge mutliple excel files into one excel workbook but different worksheets using bash scripting)

Here's my current workflow as per the answer provided in the above link :

  1. Convert each *.xlsx sheet to *.fods files

soffice --headless --convert-to fods file*.xlsx

  1. Use xsltproc to merge *.fods sheets one by one, to get a workbook which will contain all sheets.
  2. Convert the *.fods workbook back to *.xlsx using below command :

soffice --headless --convert-to xslx outputfile*.fods

The place I'm stuck is at #2. The original answer in the link provides a xsltproc based solution which can merge two sheets. I am trying to extend it incrementally by below method :

xsltproc --stringparam secondfile file2.fods tablemerge.xsl file1.fods > int_2.fods

xsltproc --stringparam secondfile file3.fods tablemerge.xsl int_2.fods > final.fods

It works perfectly as long as only 2 sheets are to be combined, but behaves unexpectedly the moment I try to add file3.fods to the intermediate file int_2.fods.

The problem I see is the final.fods contains two copies of the worksheet in file3.fods .

I suspect its a problem with the tablemerge.xsl file, which contains contains some xml syntax (below) towards the end. The issue is I dont have any clue how xml works, but the syntax seems not very complicated. Any help to suggest what modification is required in below code will be very helpful. Thanks in advance.

tablemerge.xsl:

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
  xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
  xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
  xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
  xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
  xmlns:number="urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0"
  xmlns:presentation="urn:oasis:names:tc:opendocument:xmlns:presentation:1.0"
  xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
  xmlns:chart="urn:oasis:names:tc:opendocument:xmlns:chart:1.0"
  xmlns:dr3d="urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0"
  xmlns:math="http://www.w3.org/1998/Math/MathML"
  xmlns:form="urn:oasis:names:tc:opendocument:xmlns:form:1.0"
  xmlns:script="urn:oasis:names:tc:opendocument:xmlns:script:1.0"
  xmlns:config="urn:oasis:names:tc:opendocument:xmlns:config:1.0"
  xmlns:ooo="http://openoffice.org/2004/office"
  xmlns:ooow="http://openoffice.org/2004/writer"
  xmlns:oooc="http://openoffice.org/2004/calc"
  xmlns:dom="http://www.w3.org/2001/xml-events"
  xmlns:xforms="http://www.w3.org/2002/xforms"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:rpt="http://openoffice.org/2005/report"
  xmlns:of="urn:oasis:names:tc:opendocument:xmlns:of:1.2"
  xmlns:xhtml="http://www.w3.org/1999/xhtml"
  xmlns:grddl="http://www.w3.org/2003/g/data-view#"
  xmlns:tableooo="http://openoffice.org/2009/table"
  xmlns:drawooo="http://openoffice.org/2010/draw"
  xmlns:calcext="urn:org:documentfoundation:names:experimental:calc:xmlns:calcext:1.0"
  xmlns:loext="urn:org:documentfoundation:names:experimental:office:xmlns:loext:1.0"
  xmlns:field="urn:openoffice:names:experimental:ooo-ms-interop:xmlns:field:1.0"
  xmlns:formx="urn:openoffice:names:experimental:ooxml-odf-interop:xmlns:form:1.0"
  xmlns:css3t="http://www.w3.org/TR/css3-text/" 
  office:version="1.2" 
  office:mimetype="application/vnd.oasis.opendocument.spreadsheet">

  <xsl:template match="table:table">
    <!-- copy table:table from main file -->
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
    <table:table 
      table:name="{document($secondfile)//table:table/@table:name}" 
      table:style-name="{document($secondfile)//table:table/@table:style-name}">
    <!-- copy table:table from second file -->
      <xsl:copy-of select="document($secondfile)//table:table/child::*" />
    </table:table>
  </xsl:template>

   <!-- default template: identity transform -->
   <xsl:template match="/ | @* | node()">
     <xsl:copy>
       <xsl:apply-templates select="@* | node()" />
     </xsl:copy>
   </xsl:template>

</xsl:stylesheet>

Edit1

Not sure how to completely describe what i see but i'll give it a try.

I manually created a workbook with two sheets, and saved it as xlsx, followed by conversion to *.fods. Then i compare this new fods to the one which is generated from single worksheet. It does seem to indicate to me what exactly i need.

Inside the fods file, i see the below :

#1. In fods from single sheet , I see only one table:table section :

....
  <office:spreadsheet>
   <table:calculation-settings table:case-sensitive="false" table:automatic-find-labels="false" table:use-regular-expressions="false" table:use-wildcards="true"/>
   <table:table table:name="Sheet1" table:style-name="ta1">
    ........
   </table:table>
   <table:named-expressions/>
  </office:spreadsheet>
...

#2. In *.fods from two sheets , I see only two table:table sections :

....
  <office:spreadsheet>
   <table:calculation-settings table:case-sensitive="false" table:automatic-find-labels="false" table:use-regular-expressions="false" table:use-wildcards="true"/>
   <table:table table:name="Sheet1" table:style-name="ta1">
    ........
   </table:table>
   <table:table table:name="Sheet2" table:style-name="ta2">
    .......
   </table:table>
   <table:named-expressions/>
  </office:spreadsheet>
...

I guess what i'm looking for is a tablemerge.xsl which can help me to merge two *.fods as in #1, and get something similar to *.fods in #2. Ofcourse the complication is that one of the *.fods being input can contain multiple sheets, and the final output *.fods should just concatenate the table:table section from second file into the first *.fods.

CodePudding user response:

This is mainly guesswork because we don't know the exact specification for a .fods file - and you're not even showing us a complete example. Try perhaps:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0">

<xsl:param name="secondfile"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="office:spreadsheet">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
        <xsl:copy-of select="document($secondfile)//table:table" />
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

This may or may not work depending on what role the other child elements of office:spreadsheet (namely table:calculation-settings and table:named-expressions) play in this, as well as other factors that I am not able to predict.

I am not even sure how you can test reliably that the result "works"; you may get the sheets from the other files, but they may be missing some functionality.

  • Related