Getting data out of Excel to make a nice document
By mdonohue on Nov 11, 2009
Want to thank Srikant Subramaniam from the Oracle Fusion Middleware for Apps team for prompting this post.
As much as we here at Oracle would like to think that all data is in a database -- an Oracle Database at that, the fact is that sometimes ... dare I say it ... data is kept in Excel spreadsheets.
Srikant's team maintains some information in an Excel spreadsheet. To get this data into a nicely formatted PDF file was a tedious process that took 5 to 6 days. Srikant wanted to know if Publisher could help. The challenge was how to get the data out of the Excel spreadsheet into an XML file. After a little searching on the web, we worked out a way to do that.
Now his team can generate the desired output in a couple of hours. Over 20 times faster than it used to take. Saving at least 38 hours of precious time that can now be put toward other projects.
Srikant was kind enough to document this in a nice tutorial. The tutorial shows how to convert data from an Excel spreadsheet into a PDF document. The Excel data consists of a set of addresses, that are then grouped and sorted using the BI Publisher Template Builder to create the RTF template and the output PDF file. Here are the supporting files to run the tutorial.
Knowing that dates and nulls can sometimes be problematic, I took it a little further to test out an Excel spreadsheet with dates and nulls. Fortunately the NULL values are converted nicely and those elements with NULL values are absent in the resulting XML - just the way you want it so that values can be aggregated correctly.
The Dates however proved to be a little more challenging. In the XML data, they kept coming out as Excel number values and not in date canonical format. I was also using Excel 2007 so the process was a little different from what Srikant documented in the tutorial. Here are the differences for Excel 2007 and what I found I had to do to get Dates to come out in the XML correctly. Perhaps someone more clever has a better way ...
-- When converting the range to an XML List I was warned/prompted to use format or data type when mapping. When I chose data type (which I thought would work) I ended up with dates as the Excel number values in the spreadsheet and the exported XML data. So I went with "Use existing formatting" for the the mapping so the dates would still appear as dates in the spreadsheet.
However, exporting to XML still left the dates as Excel number values.
-- The "trick" is that you must create and apply a new "XML Map" (this appears to be the Excel term for XSD) that tells Excel that the date is to be mapped to an XML date format. The first step is to generate the XSD file. Set the focus in the data range and select "Create XSD files for the XML Schema at the active cell" from the XML Tools Add-in menu.
-- In the XSD you will see that the Date field is being treated as a string.
Modify it so that it's type is Date.
-- Upload the XSD as a new XML Map and map the field to this new XML map. Set focus on a cell in the region, right click and select the "XML Source" menu choice from the XML menu.
In the XML Source dialog click on the [XML Maps...] button. Add the XSD you just modified and rename it to something meaningful. Then delete the Root_Map that was there.
Click OK to return the XML Source dialog and drag and drop the Map elements onto the correct columns headers in the data range.
-- Now when you export to XML, the dates will be in canonical format and can be formatted correctly by the Template Builder or other XSL formatting tools.