Pentaho: why does the XML Output block put out CSV?

Time:12-02

I'm good with XML but a beginner with Pentaho, and I'm stumbling while connecting the pieces.

The overall goal is to run an SQL query and format the output with XSLT into a nice email body. I use the XML Output block to get the SQL query result as XML. It does save the result to a file in XML format, but it passes the data to the next block as CSV. I found this out by hooking up a temporary Text File Output block.

Why would a block named XML Output produce CSV output?

How can I get the XML to the XSL Transformation block without saving it to a file on disk in between?

Are there simpler ways of getting a nice email message body from an SQL SELECT query?

Here is the Pentaho transformation, where Report Totals does a simple SQL SELECT:

[Image: screenshot of the Pentaho transformation]

CodePudding user response:

All steps in PDI output rows of data with the same structure (field names, data types, etc.).

Each step is responsible for picking up those rows of data and doing something with them.

Your XML Output step picks up the rows coming from the Table Input step and writes them as XML to an external file. But it looks like what you want is to create an XML field to use later in the XSL Transformation step and then in the email. The step you're looking for is probably Add XML, not XML Output.

Add XML takes the incoming rows and produces one XML string per row in a new field. Those strings can then be grouped together into a single row, wrapped in a root XML element, transformed with XSLT, and appended to the email body.
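
To make the data flow concrete, here is a rough Python analogy of that pipeline. It is not PDI code. The function names, the element names (`Row`, `Rows`) and the sample rows are all made up for illustration, and it only covers the steps up to wrapping the fragments in a root element.

```python
import xml.etree.ElementTree as ET


def row_to_xml(row, element_name="Row"):
    """Roughly what Add XML does: turn one row into one XML string.

    Assumes every field name is a valid XML element name.
    """
    elem = ET.Element(element_name)
    for field, value in row.items():
        child = ET.SubElement(elem, field)
        child.text = str(value)
    return ET.tostring(elem, encoding="unicode")


def wrap_in_root(fragments, root_name="Rows"):
    """Concatenate the per-row fragments and wrap them in a single root element."""
    return "<{0}>{1}</{0}>".format(root_name, "".join(fragments))


# Hypothetical result of the SQL SELECT (stands in for the Table Input rows).
rows = [
    {"region": "North", "total": 120},
    {"region": "South", "total": 95},
]

# One XML string per row, then grouped into a single document string.
fragments = [row_to_xml(r) for r in rows]
document = wrap_in_root(fragments)
print(document)
```

The resulting `document` string is what would be handed to the XSL transformation as a field, with no file on disk involved.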

You can right-click each of those steps and click Preview to see what data is going out of it. That should help you make sense of PDI's internals.

The Text File Output step writes whatever data comes in as CSV. As no XML fields are coming in, no XML is written out (the XML Output step doesn't add an XML field to the data stream, it only writes to an external file).
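
The behaviour you saw can be modelled in a few lines of Python. Again, this is only an analogy under assumptions, not how PDI is implemented: the step functions, the in-memory "files" and the comma separator are all invented for the sketch. The point is that the XML step writes XML as a side effect and hands the same plain rows to the next step.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_output_step(rows, sink):
    """Write the rows as XML to `sink`, then pass the rows on unchanged."""
    root = ET.Element("Rows")
    for row in rows:
        elem = ET.SubElement(root, "Row")
        for field, value in row.items():
            ET.SubElement(elem, field).text = str(value)
    sink.write(ET.tostring(root, encoding="unicode"))
    return rows  # no XML field is added to the stream


def text_file_output_step(rows, sink):
    """Write whatever fields arrive as delimited text, then pass the rows on."""
    writer = csv.DictWriter(sink, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return rows


# Hypothetical query result.
rows = [{"region": "North", "total": 120}, {"region": "South", "total": 95}]

xml_file = io.StringIO()  # stands in for the XML file on disk
csv_file = io.StringIO()  # stands in for the temporary text file

downstream = text_file_output_step(xml_output_step(rows, xml_file), csv_file)

print(xml_file.getvalue())
print(csv_file.getvalue())
```

The XML only ever exists in `xml_file`. The rows that reach the text step are still the original fields, so that step has nothing but plain values to write.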
