PaaS Partner Community

  • June 26, 2016

Processing large XML files in the SOA Suite by Emiel Paasschens

Juergen Kress
PaaS Partner Adoption

Read large XML files in chunks


On my current project, end users upload XML files to be processed in the Oracle SOA Suite. The XML files contain information about employers and their employees. Because an employer can have hundreds or even thousands of employees, these XML files can be quite large.
Processing such large XML files consumes a lot of memory and can become a bottleneck, especially when multiple end users upload large XML files at the same time. It can even crash a server with an OutOfMemory error.
The best way to solve this is to read and process the large XML files in chunks: read and process XML fragments instead of the full XML file.
My colleague, Aldo Schaap, has already done this for CSV files and describes it in his blog post “Processing large files through SOA Suite using Synchronous File Read”. I gratefully used his blog post to do the same for XML processing. However, a few things are slightly different when reading XML instead of CSV, which is the reason for this blog.
Another reason is that I ran into a second problem, which I will describe later on in this blog. To solve it, I had to ‘pre-transform’ the XML file, meaning the XML file is transformed before it is read by the SOA Suite. To achieve this I used the pre-processing features of the file adapter with a custom (Java) valve. Pre- and post-processing is described in the blog post “SOA Suite File Adapter Pre and Post processing using Valves and Pipelines” by Lucas Jellema.
The combination of these two blog posts gave me the solution to my problem.
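
As a rough illustration of reading XML in fragments rather than as one document, here is a minimal plain-Java sketch using the standard StAX API. It is not the file adapter's chunked read that this blog relies on, only the same idea outside the SOA Suite. The class name, the element and attribute names (Message, Employer, Employee, name), and the chunk size are all assumptions made for the example.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the SOA Suite or of the original solution.
public class ChunkedXmlReader {

    /**
     * Streams the XML and passes the "name" attribute of every matching
     * element to chunkHandler in batches of at most chunkSize.
     * Returns the number of batches handed over.
     */
    public static int readInChunks(Reader xml, String elementName, int chunkSize,
                                   Consumer<List<String>> chunkHandler)
            throws XMLStreamException {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Uploaded files are untrusted input: switch off DTDs and external entities.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        List<String> chunk = new ArrayList<>();
        int chunks = 0;
        try {
            // The parser moves forward event by event, so the full document
            // is never held in memory, only the current chunk.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    chunk.add(reader.getAttributeValue(null, "name"));
                    if (chunk.size() == chunkSize) {
                        chunkHandler.accept(new ArrayList<>(chunk));
                        chunk.clear();
                        chunks++;
                    }
                }
            }
        } finally {
            reader.close();
        }
        // Hand over the last, possibly smaller, chunk.
        if (!chunk.isEmpty()) {
            chunkHandler.accept(chunk);
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Assumed sample data, invented for this sketch.
        String xml = "<Message>"
                + "<Employer name=\"Acme\">"
                + "<Employee name=\"Ann\"/><Employee name=\"Bob\"/><Employee name=\"Cas\"/>"
                + "</Employer>"
                + "<Employer name=\"Globex\">"
                + "<Employee name=\"Dee\"/><Employee name=\"Eli\"/>"
                + "</Employer>"
                + "</Message>";
        int chunks = readInChunks(new StringReader(xml), "Employee", 2,
                chunk -> System.out.println("Processing chunk: " + chunk));
        System.out.println("Chunks processed: " + chunks);
    }
}
```

With this approach memory use depends on the chunk size rather than on the file size. Note that in this sketch a chunk can span two employers; a real implementation would probably also have to carry the employer context along with each chunk.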

Problem Description

Back to my problem. The large XML files that have to be parsed contain one ‘Message’ element as root. This root element contains one or more employers, each with some basic employer information, and each employer can contain multiple employee elements, up to thousands, with employee and employment information. In the real use case the XML structure uses Dutch element names and is very specific to the business problem. For the purpose of this blog, I’ve reduced the problem to a basic XML structure with English names and some basic sample data. XSD source: Read the complete article here.

SOA & BPM Partner Community

For regular information on Oracle SOA Suite, become a member of the SOA & BPM Partner Community. To register, please visit www.oracle.com/goto/emea/soa (OPN account required). If you need support with your account, please contact the Oracle Partner Business Center.
