
Antony Reynolds' Blog

Structure in a Flat World

Antony Reynolds
Senior Director Integration Strategy

Adding Structure to Flat XML Documents

A friend recently asked how to convert a flat document structure into a more structured form.

The type of flat structure is shown in the diagram below:

The deptNo and deptName fields repeat for each employee in the department.
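As a rough stand-in for the diagram, a flat document of this kind might look like the sketch below. The element names come from the XPath expressions and the stylesheet in the comments on this post. The namespace URI is borrowed from that same comment and is an assumption here, as are the order of the child elements and all of the data values.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative data only. The namespace URI and child element order are assumptions. -->
<ns1:collection xmlns:ns1="http://antony.blog/flat">
  <ns1:entry>
    <ns1:deptNo>10</ns1:deptNo>
    <ns1:deptName>Accounting</ns1:deptName>
    <ns1:empNo>7782</ns1:empNo>
    <ns1:empName>Clark</ns1:empName>
  </ns1:entry>
  <ns1:entry>
    <ns1:deptNo>20</ns1:deptNo>
    <ns1:deptName>Research</ns1:deptName>
    <ns1:empNo>7369</ns1:empNo>
    <ns1:empName>Smith</ns1:empName>
  </ns1:entry>
  <!-- deptNo and deptName are repeated for the second Accounting employee -->
  <ns1:entry>
    <ns1:deptNo>10</ns1:deptNo>
    <ns1:deptName>Accounting</ns1:deptName>
    <ns1:empNo>7839</ns1:empNo>
    <ns1:empName>King</ns1:empName>
  </ns1:entry>
</ns1:collection>
```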

This would be better represented as a structured format like the one shown below:

 

Note that the department details now appear once per department, and the employees appear in a sequence called emp. This representation is more natural and easier to manipulate elsewhere.
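A structured document of this shape might look like the sketch below, for a hypothetical input containing two departments. The element names, the emps wrapper around the emp sequence, and the namespace URI follow the stylesheet in the comments on this post, so they are assumptions about the original schema. The data values are made up.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative data only. Element names and namespace URI are assumptions. -->
<struct:collection xmlns:struct="http://antony.blog/structured">
  <struct:dept>
    <!-- Department details appear once -->
    <struct:deptNo>10</struct:deptNo>
    <struct:deptName>Accounting</struct:deptName>
    <struct:emps>
      <!-- One emp element per employee in the department -->
      <struct:emp>
        <struct:empNo>7782</struct:empNo>
        <struct:empName>Clark</struct:empName>
      </struct:emp>
      <struct:emp>
        <struct:empNo>7839</struct:empNo>
        <struct:empName>King</struct:empName>
      </struct:emp>
    </struct:emps>
  </struct:dept>
  <struct:dept>
    <struct:deptNo>20</struct:deptNo>
    <struct:deptName>Research</struct:deptName>
    <struct:emps>
      <struct:emp>
        <struct:empNo>7369</struct:empNo>
        <struct:empName>Smith</struct:empName>
      </struct:emp>
    </struct:emps>
  </struct:dept>
</struct:collection>
```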

So the question is, how do I get from the flat schema to the structured schema?

The answer lies in the preceding-sibling and following-sibling XPath axes.

To find the first occurrence of each department, we select every entry whose deptNo does not appear in an earlier entry in the document, using this XPath expression:

<xsl:for-each select="/ns1:collection/ns1:entry[not(ns1:deptNo = preceding-sibling::ns1:entry/ns1:deptNo)]">

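Within the first occurrence of a department we then set a variable to hold the department number. The variable is needed because inside a later predicate ns1:deptNo refers to the entry being tested, not to this first entry: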

<xsl:variable name="DeptNo" select="ns1:deptNo"/>

Within the department we first output the employee held in the current entry. We then select all the later entries that have the same department number and add their employee details by using the following XPath expression:

<xsl:for-each select="following-sibling::ns1:entry[ns1:deptNo = $DeptNo]">
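Put together, the three fragments above could be assembled into an XSLT 1.0 stylesheet along the following lines. This is a minimal sketch, not the stylesheet from the sample project: the namespace URIs and the output element names (dept, emps, emp and so on) are assumptions borrowed from the XSLT 2.0 example in the comments on this post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: namespace URIs and output element names are assumptions. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ns1="http://antony.blog/flat"
    xmlns:struct="http://antony.blog/structured"
    exclude-result-prefixes="ns1">

  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <struct:collection>
      <!-- Visit only the first entry for each distinct deptNo -->
      <xsl:for-each select="/ns1:collection/ns1:entry[not(ns1:deptNo = preceding-sibling::ns1:entry/ns1:deptNo)]">
        <!-- Capture the department number before the context changes -->
        <xsl:variable name="DeptNo" select="ns1:deptNo"/>
        <struct:dept>
          <struct:deptNo><xsl:value-of select="ns1:deptNo"/></struct:deptNo>
          <struct:deptName><xsl:value-of select="ns1:deptName"/></struct:deptName>
          <struct:emps>
            <!-- Employee carried by the current (first) entry -->
            <struct:emp>
              <struct:empNo><xsl:value-of select="ns1:empNo"/></struct:empNo>
              <struct:empName><xsl:value-of select="ns1:empName"/></struct:empName>
            </struct:emp>
            <!-- Employees from later entries in the same department -->
            <xsl:for-each select="following-sibling::ns1:entry[ns1:deptNo = $DeptNo]">
              <struct:emp>
                <struct:empNo><xsl:value-of select="ns1:empNo"/></struct:empNo>
                <struct:empName><xsl:value-of select="ns1:empName"/></struct:empName>
              </struct:emp>
            </xsl:for-each>
          </struct:emps>
        </struct:dept>
      </xsl:for-each>
    </struct:collection>
  </xsl:template>

</xsl:stylesheet>
```

Because each entry is compared against its siblings, the work grows roughly with the square of the number of entries, which is usually fine for small documents.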

A sample JDeveloper project to test this is available here.

Comments ( 1 )
  • Geoff Thursday, October 20, 2011

    If you have the luxury of using XSLT 2.0, the xsl:for-each-group instruction provides a simpler way to achieve this result:

    <?xml version="1.0" encoding="ISO-8859-1" ?>
    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns:flat="http://antony.blog/flat"
                    xmlns:struct="http://antony.blog/structured">
      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
      <xsl:template match="/">
        <struct:collection>
          <xsl:for-each-group select="//flat:entry" group-by="flat:deptNo">
            <struct:dept>
              <struct:deptNo>
                <xsl:value-of select="current-grouping-key()"/>
              </struct:deptNo>
              <struct:deptName>
                <xsl:value-of select="flat:deptName"/>
              </struct:deptName>
              <struct:emps>
                <xsl:for-each select="current-group()">
                  <struct:emp>
                    <struct:empNo>
                      <xsl:value-of select="flat:empNo"/>
                    </struct:empNo>
                    <struct:empName>
                      <xsl:value-of select="flat:empName"/>
                    </struct:empName>
                  </struct:emp>
                </xsl:for-each>
              </struct:emps>
            </struct:dept>
          </xsl:for-each-group>
        </struct:collection>
      </xsl:template>
    </xsl:stylesheet>

