Large File Processing With JCA File Adapter and OSB

I was asked to prototype a use case in which a huge file (10+ GB) needed to be processed by OSB. Yes, this is a horrible thing, but sometimes we have to architect horrible things! This use case had two important aspects:

1. The file contained many repeating records of the same type.  Each record was pretty small.

2. The records in the file were not related and their order didn't matter.

This is important because if the records were related or had to be processed in order, more design work would be needed. I don't go into it here, but it would be possible to attach a single-threaded work manager to the OSB proxy and send the messages to a JMS queue using WebLogic's Unit-of-Order capability.

So how can we handle this?

We can't use the file transport because if we touch the payload or loop over the records, we end up bringing the whole message into memory. I don't really want to set up an 11 GB JVM!

We could just write some Java code to break up the file, but it would be nice to manage this in OSB. We could pass the filename into OSB and use a Java callout to break the message apart, but I'd be worried about how long this might take and what happens when a proxy hangs around that long on a Java callout.
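For comparison, here is a minimal sketch of what the standalone Java approach might look like, using the JDK's StAX pull parser so that only the current element is held in memory rather than the whole file. The class and method names (`RecordStreamer`, `streamRecords`) are hypothetical, and the element names `record` and `title` are assumed from the generator script in the comments; this is not the approach the rest of the post takes.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: walks a large XML document one event at a time.
final class RecordStreamer {

    /**
     * Reads the input as a stream of XML events, passes the text of each
     * <title> element to the handler, and returns the number of <record>
     * elements seen. The document is never loaded into memory as a whole.
     */
    public static long streamRecords(Reader in, Consumer<String> titleHandler)
            throws XMLStreamException {
        XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(in);
        long count = 0;
        try {
            while (xml.hasNext()) {
                if (xml.next() != XMLStreamConstants.START_ELEMENT) {
                    continue; // skip text, end tags, comments, etc.
                }
                String name = xml.getLocalName();
                if ("record".equals(name)) {
                    count++;
                } else if ("title".equals(name)) {
                    // getElementText() consumes up to the matching end tag.
                    titleHandler.accept(xml.getElementText().trim());
                }
            }
        } finally {
            xml.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Tiny in-memory sample; a real run would pass a reader over the file.
        String sample = "<largeFile>"
                + "<record><title>Record 1</title><data>a,b</data></record>"
                + "<record><title>Record 2</title><data>c,d</data></record>"
                + "</largeFile>";
        long n = streamRecords(new StringReader(sample), System.out::println);
        System.out.println("records: " + n);
    }
}
```

In a real run the handler would forward each record somewhere (a file, a queue) instead of printing its title. The sketch shows that the splitting itself is simple; the concern raised above is about running something this long-lived inside a proxy, not about the parsing.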

So, let's use the JCA File Adapter that comes with SOA Suite.

Step 1 - Set up the environment and make a schema definition.

Step 2 - Create JCA File Adapter

Step 3 - Bring JCA File Adapter into OSB - Make Proxy

Step 4 - Initial testing to output files.

Step 5 - Testing large files.

Step 6 - Testing 11GB file.

Comments:

For those who need the Perl script to generate the large files:

---
#!/usr/bin/perl
use strict;
use warnings;

# Generates an XML test file containing <recordCount> records of equal size.
if (@ARGV != 2) {
    print "Usage: run.pl <recordCount> <outputFile>\n";
    exit(1);
}
my ($count, $outputFile) = @ARGV;
print "Count: $count\n";

open(my $out, '>', $outputFile) or die "Can't create output file $outputFile: $!\n";

# One line of filler data: 11 comma-separated values.
my $dataLine = join(',', ('lotsOfData') x 11);

print $out "<largeFile>\n";
for my $i (1 .. $count) {
    print $out " <record>\n";
    print $out " <title>Record $i</title>\n";
    print $out " <data>\n";
    # Eight filler lines; every line except the last ends with a comma.
    print $out " $dataLine,\n" for 1 .. 7;
    print $out " $dataLine\n";
    print $out " </data>\n";
    print $out " </record>\n";
}
print $out "</largeFile>\n";
close($out) or die "Can't close output file $outputFile: $!\n";
---

Posted by John Graves on November 10, 2014 at 10:49 AM EST #
