How to generate Excel Spreadsheet from XML data in process variables

Added by Kendy Yus, last edited by Kendy Yus on Aug 17, 2009  (view change)

Labels:

Enter labels to add to this page:
Wait Image 
Looking for a label? Just start typing.

For the full reference of the Spreadsheet Util DSC related to this article, please click here.

Background

Here is the business scenario where you need to generate a report in an Excel Spreadsheet format containing data in tabulated form. One way is to generate a CSV file using Avoka SQL Plus service operation that does exporting data from database into a CSV file (For more information about this, please refer to Retrieving data from a database and saving as CSV file article). Alternatively, use SpreadSheet Util DSC with operation Generate Spreadsheet to generate an Excel Spreadsheet from process data. The operation Generate Spreadsheet is also capable of handling repeating element and filtering based on XML data.

Key features for Spreadsheet Util - Generate Spreadsheet:

  1. It allows the specification of spreadsheet contents at design time using XML configuration file. More sample will be provided in this article
  2. It supports literal value as well as dynamic reference to process variables or both using XPath Expression 
  3. It supports selective repeating elements processing on process variable
  4. It is capable of specifying region for merged cells
  5. It dynamically works out the row position if the intended row has been occupied due to repeating row processing and will use the next available row instead, i.e. push downward. 

What it does not do currently:

  1. Cell formatting like font, color, and column width.

XML Mapping File to Determine Spreadsheet Contents and Structure

The spreadsheet generator operation uses an XML file to define the structure of the spreadsheet as well as its contents. The schema of the XML mapping file is provided at the end of this article.

The mapping XML specify what are the sheets and rows and columns that together make up a workbook or an excel spreadsheet.

Here are a few sample XML mapping file:

How to Create a Sheet and Multiple Sheets in a Spreadsheet

To create a sheet within a workbook with an empty content, below is the XML mapping file:

<workbook>
     <sheet name="Generated Sheet">
     </sheet>
 </workbook>

To create more than one sheet within a workbook, you simply add more <sheet> tag under <workbook>:

<workbook>
     <sheet name="Generated Sheet">
     </sheet>
     <sheet name="Generated Sheet 2">
     </sheet>
     <sheet name="Generated Sheet 3">
     </sheet>
 </workbook>

How to Create a Sheet with Multiple Rows with Static Data

To create a sheet with some rows that contains some static data, here is the XML mapping file:

<workbook>
     <sheet name="Generated Sheet">
	<row loc="0">
           <column loc="0" >
	      <value>First column</value>
	      <type>string</type>
	   </column>
           <column loc="1" >
	      <value>Second column</value>
	      <type>string</type>
	   </column>
           <column loc="2" >
	      <value>Number column</value>
	      <type>222</type>
	   </column>
           <column loc="3" >
	      <value>25/12/2008</value>
	      <type>date</type>
	      <dateFormat>dd/MM/yyyy</dateFormat>
	   </column>
	</row>
     </sheet>
 </workbook>

How to Create Merged Cell Region

To create a merged cell region with text, here is the sample mapping file:

<workbook>
     <sheet name="Generated Sheet">
	<row loc="0">
  	   <column loc="0" mergedColumns="true" endLoc="5" endRow="1">
	      <value>Merged Region</value>
	      <type>string</type>
	   </column>
	</row>
     </sheet>
 </workbook>

Quick Way to Specify Columns in a Row

There are times where you have a lot of columns in a row. A quicker way to specify columns in a row would be via delimited string using a pipe ('|') as column separator. Here is the sample of how you could do it:

<workbook>
    <sheet name="Generated Sheet">
        <row loc="2" concatRow="true" columnStartLoc="0">
            <columnText>
                header1|header2|header3|header4|header5|header6
            </columnText>
        </row>
    </sheet>
</workbook>

For specifying an empty column between two occupied columns, simply use an empty space character, e.g. header1| |header3.

How to Create Multiple Rows Based on Repeating XML Element

When specifying repeating rows via iterating a repeating section of an XML, there are a couple of attributes that you need to specify, i.e. "unbound" to be true, and the "repeatingElement" which specifies the repeating section of the XML data.

<workbook>
    <sheet name="Generated Sheet">
	<row loc="3" unbound="true" repeatingElement="{$ /process_data/xml_var/root/adult/kids/kid[age='5'] $}">
  	   <column loc="0">
	      <value>{$firstname$}</value>
	      <type>string</type>
	   </column>
  	   <column loc="1">
	      <value>{$favourite_toy$}</value>
	      <type>string</type>
	   </column>
  	   <column loc="2">
	      <value>{$physical_attribute/height$}</value>
	      <type>numeric</type>
	   </column>
  	   <column loc="3">
	      <value>{$physical_attribute/weight$}</value>
	      <type>numeric</type>
	   </column>
  	   <column loc="4">
	      <value>{$../../lastname$}</value>
	      <type>numeric</type>
	   </column>
        </row>
    </sheet>
</workbook>

Note: It supports query within XPath expression in the repeating element. In the example above, it only selects repeating section where the kid's age is 5. And also you can use ".." to navigate to the parent element in the XML tree of the XML data.

Here you can also use the quick way of specifying columns in a row without worrying the format of the data:

<workbook>
    <sheet name="Generated Sheet">
	<row loc="3" unbound="true" repeatingElement="{$ /process_data/xml_var/root/adult/kids/kid[age='5'] $}" concatRow="true" columnStartLoc="0">
  	   <columnText>
                {$firstname$}|{$favourite_toy$}|{$physical_attribute/height$}|{$physical_attribute/weight$}
	   </columnText>
        </row>
    </sheet>
</workbook>






What about Repeating XML Element within a Repeating Section

If the repeating element that you want to select belongs to a repeating section itself, and you want to restrict/filter the result based on its parent element, here is an example:

<workbook>
    <sheet name="Generated Sheet">
	<row loc="3" unbound="true" repeatingElement="{$ /process_data/xml_var/root/adult/kids/kid[age='5'] $}">
           <filters>
              <filter>
                  <element>{$../../lastname$}</element>
                  <value>Brown</value>
              </filter>
           </filters>
  	   <column loc="0">
	      <value>{$firstname$}</value>
	      <type>string</type>
	   </column>
  	   <column loc="1">
	      <value>{$favourite_toy$}</value>
	      <type>string</type>
	   </column>
  	   <column loc="2">
	      <value>{$physical_attribute/height$}</value>
	      <type>numeric</type>
	   </column>
  	   <column loc="3">
	      <value>{$physical_attribute/weight$}</value>
	      <type>numeric</type>
	   </column>
  	   <column loc="4">
	      <value>{$../../lastname$}</value>
	      <type>numeric</type>
	   </column>
        </row>
    </sheet>
</workbook>

The above example will only retrieve kids that are 5 years old with parent with the last name of "Brown".

Putting It Together In LiveCycle Workflow

Now, once you have the XML mapping file, you are ready to generate a spreadsheet in your workflow.

First, Spreadsheet Util DSC needs to be installed to your LiveCycle server via Workbench in the Components View. Once installed, you need to start/activate it.

The Spreadsheet Util service will now appear on the Service View of the Workbench, with one operation "Generate spreadsheet" ready to be used in a workflow.

Here is a sample of how the workflow will look like:


In the sample workflow above, I read the XML data file into an XML variable (using read XML of File Utils),  read the mapping rule file into a variable, and finally I use the mapping rule to call generate spreadsheet.
 
Here is the property setting of the generate spreadsheet step:


 
As the result of executing the above step in the workflow, it will generate a spreadsheet and save it on the specified output file path location.

Please click here to download an LCA file containing a sample of generating Spreadsheet from XML.


If you want to have a go yourself with generating a dynamic spreadsheet in a workflow, the Spreadsheet Util DSC will be made available shortly via our website http://www.avoka.com/avoka/escomponents.shtml. But in the mean time, you could check out what are the other useful DSC in store.

We would appreciate any feedback or suggestions if you find there are other operations you would like to have for your workflow with regards to generating a spreadsheet.


XML Schema For the Mapping File

Here is the schema of the XML mapping file:

 

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:element name="workbook">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="sheet" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="row" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="column" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:sequence>
                          <xs:element name="value"/>
                          <xs:element name="type"/>
                        </xs:sequence>
                        <xs:attribute name="loc" use="required" type="xs:integer"/>
                        <xs:attribute name="mergedColumns" type="xs:boolean"/>
                        <xs:attribute name="endLoc" type="xs:integer"/>
                        <xs:attribute name="endRow" type="xs:integer"/>
                      </xs:complexType>
                    </xs:element>
                    <xs:element name="columnText"/>
                  </xs:sequence>
                  <xs:attribute name="loc" use="required" type="xs:integer"/>
                  <xs:attribute name="concatRow" type="xs:boolean"/>
                  <xs:attribute name="columnStartLoc" type="xs:integer"/>
                  <xs:attribute name="unbound" type="xs:boolean"/>
                  <xs:attribute name="repeatingElement" type="xs:string"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
            <xs:attribute name="name" use="required" type="xs:string"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Please note that row element attribute where it can be specified to be "unbound", allowing processing of repeating element referencing to an XML data specified in its "repeatingElement" attribute.