Wednesday Mar 06, 2013

Chunked Step using Batch Applications in Java EE 7: Getting Started with GlassFish 4 (TOTD #211)

TOTD #192 explained the key concepts of JSR 352. This Tip Of The Day provides a working example of how to write a simple chunk step using the JSR 352 Reference Implementation integrated in GlassFish 4.

The source code for this sample application can be downloaded from here and works on GlassFish 4 b78.

As explained in TOTD #192, JSR 352 defines item-oriented processing using chunk step and task-oriented processing using batchlet step. A chunk consists of a reader that reads one item at a time, a processor that processes one item at a time, and a writer that aggregates 'chunk' number of items and then writes them out.

Here is an implementation of reader:

public class MyItemReader extends AbstractItemReader<MyInputRecord> {
    private final StringTokenizer tokens;

    public MyItemReader() {
        tokens = new StringTokenizer("1,2,3,4,5,6,7,8,9,10", ",");
    }

    public MyInputRecord readItem() {
        if (tokens.hasMoreTokens()) {
            return new MyInputRecord(Integer.valueOf(tokens.nextToken()));
        }
        // null signals that there are no more items to read
        return null;
    }
}

The reader uses type information to specify the type of record that it is working on, MyInputRecord in this case. The readItem method returns null to indicate the end of items that can be read. In this case, a StringTokenizer is used to read the items, but you can use an InputStream here if you like.
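If you would rather read from an InputStream, the same "return null at the end" contract applies. Here is a minimal sketch in plain Java, with no batch runtime involved. The class name StreamItemReader and the one-id-per-line input format are assumptions made for illustration; in the sample application the class would extend AbstractItemReader and return MyInputRecord instead of Integer.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone reader, not part of the sample application.
public class StreamItemReader {
    private final BufferedReader lines;

    public StreamItemReader(InputStream in) {
        lines = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Returns the next id, or null once the stream is exhausted.
    // Blank lines are skipped (an assumption about the input format).
    public Integer readItem() throws IOException {
        String line;
        while ((line = lines.readLine()) != null) {
            line = line.trim();
            if (!line.isEmpty()) {
                return Integer.valueOf(line);
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "1\n2\n3\n".getBytes(StandardCharsets.UTF_8));
        StreamItemReader reader = new StreamItemReader(in);
        Integer id;
        while ((id = reader.readItem()) != null) {
            System.out.println("read: " + id);
        }
    }
}
```

The in-memory stream in main only keeps the sketch self-contained; any other InputStream, such as a file or a socket, could be passed to the constructor instead.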

Here is an implementation of processor:

public class MyItemProcessor implements ItemProcessor<MyInputRecord,MyOutputRecord> {

    public MyOutputRecord processItem(MyInputRecord t) {
        System.out.println("processItem: " + t);
        // returning null drops the item; even ids are skipped, odd ids are doubled
        return (t.getId() % 2 == 0) ? null : new MyOutputRecord(t.getId() * 2);
    }
}

The processor uses type information to specify the types of the input and output records it is working on: MyInputRecord is the input record and MyOutputRecord is the output record in this case. The processItem method reads the input record and accepts only odd-numbered records; returning null drops an item. This is where your business logic would be implemented.

And here is an implementation of writer:

public class MyItemWriter extends AbstractItemWriter<MyOutputRecord> {

    public void writeItems(List<MyOutputRecord> list) {
        System.out.println("writeItems: " + list);
    }
}

Writer uses type information as well to specify the type of output record, MyOutputRecord in this case. All these elements are tied together using "Job XML" as shown:

<job id="myJob" xmlns="http://batch.jsr352/jsl">
    <step id="myStep" >
        <chunk item-count="3">
            <reader ref="myItemReader"></reader>
            <processor ref="myItemProcessor"></processor>
            <writer ref="myItemWriter"></writer>
        </chunk>
    </step>
</job>

Eventually, the references myItemReader, myItemProcessor, and myItemWriter will be resolved using CDI. But for now, the references can be explicitly resolved using "batch.xml" as shown:

<batch-artifacts xmlns="">
    <ref id="myItemReader" class="org.glassfish.chunk.simple.MyItemReader"/>
    <ref id="myItemProcessor" class="org.glassfish.chunk.simple.MyItemProcessor"/>
    <ref id="myItemWriter" class="org.glassfish.chunk.simple.MyItemWriter"/>
</batch-artifacts>
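To get a feel for what item-count="3" means for the writer, here is a rough simulation of the chunk loop in plain Java, without any batch runtime. It is a sketch under two assumptions: that item-count counts the items read (so items dropped by the processor still count toward the chunk), and that a final partial chunk is still handed to the writer. The class ChunkLoopSketch and its run method are made up for illustration; the actual runtime may behave differently, for example by not invoking the writer for an empty chunk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical simulation of a chunk step; not the JSR 352 runtime.
public class ChunkLoopSketch {

    // Mirrors MyItemProcessor: drop even ids, double odd ids.
    static Integer process(int id) {
        return (id % 2 == 0) ? null : id * 2;
    }

    // Returns the lists that would be handed to the writer, one per chunk.
    static List<List<Integer>> run(String input, int itemCount) {
        StringTokenizer tokens = new StringTokenizer(input, ",");
        List<List<Integer>> chunks = new ArrayList<>();
        List<Integer> buffer = new ArrayList<>();
        int readInChunk = 0;
        while (tokens.hasMoreTokens()) {
            int id = Integer.parseInt(tokens.nextToken());
            readInChunk++;
            Integer out = process(id);
            if (out != null) {
                buffer.add(out);
            }
            if (readInChunk == itemCount) {
                // chunk boundary: hand the buffered items to the writer
                chunks.add(buffer);
                buffer = new ArrayList<>();
                readInChunk = 0;
            }
        }
        if (readInChunk > 0) {
            chunks.add(buffer); // final partial chunk (assumption, see above)
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (List<Integer> chunk : run("1,2,3,4,5,6,7,8,9,10", 3)) {
            System.out.println("writeItems: " + chunk);
        }
    }
}
```

Under these assumptions the ten input items produce four chunks: [2, 6], [10], [14, 18], and an empty last chunk, because item 10 is even and gets dropped by the processor.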

Once again, download the source code from here and get it running on GlassFish 4 b78.


Wednesday Jan 02, 2013

Batch Applications in Java EE 7 - Understanding JSR 352 Concepts: TOTD #192

Batch processing is the execution of a series of "jobs" and is suitable for non-interactive, bulk-oriented, and long-running tasks. Typical examples are end-of-month bank statement generation, end-of-day jobs such as interest calculation, and ETL (extract-transform-load) in a data warehouse. These tasks are typically data or computationally intensive, execute sequentially or in parallel, and may be initiated through various invocation models, including ad-hoc, scheduled, and on-demand.

JSR 352 will define a programming model for batch applications and a runtime for scheduling and executing jobs. This blog will explain the main concepts in JSR 352.

The diagram below highlights the key concepts of a batch processing architecture.

  • A Job is an instance that encapsulates an entire batch process. A job is typically put together using a Job Specification Language and consists of multiple steps. The Job Specification Language for JSR 352 is implemented with XML and is referred to as "Job XML".
  • A Step is a domain object that encapsulates an independent, sequential phase of a job. A step contains all of the information necessary to define and control the actual batch processing.
  • JobOperator provides an interface to manage all aspects of job processing, including operational commands, such as start, restart, and stop, as well as job repository commands, such as retrieval of job and step executions.
  • JobRepository holds information about jobs currently running and jobs that ran in the past. JobOperator provides access to this repository.
  • The Reader-Processor-Writer pattern is the primary pattern and is called chunk-oriented processing. In it, an ItemReader reads one item at a time, an ItemProcessor processes the item according to the business logic (for example, calculating an account balance) and hands it to an ItemWriter for aggregation. Once the 'chunk' number of items has been aggregated, they are written out and the transaction is committed.

    JSR 352 also defines a roll-your-own batch pattern called a Batchlet. This batch pattern is invoked once, runs to completion, and returns an exit status. This pattern must implement and honor a "cancel" callback to enable operational termination of the Batchlet.
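The Batchlet contract described above (invoked once, runs to completion, returns an exit status, honors a cancel callback) can be sketched in plain Java. The SimpleBatchlet interface, the CountingBatchlet class, and the "COMPLETED" and "STOPPED" status strings are all hypothetical and only modeled on the description; they are not the JSR 352 API.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the Batchlet idea; not the JSR 352 API.
public class BatchletSketch {

    interface SimpleBatchlet {
        String process() throws Exception; // invoked once, returns an exit status
        void stop();                       // the "cancel" callback
    }

    static class CountingBatchlet implements SimpleBatchlet {
        private final AtomicBoolean stopRequested = new AtomicBoolean(false);
        private final int units;
        int done = 0;

        CountingBatchlet(int units) {
            this.units = units;
        }

        @Override
        public String process() {
            while (done < units) {
                // honor the cancel request between units of work
                if (stopRequested.get()) {
                    return "STOPPED";
                }
                done++; // stands in for one unit of real work
            }
            return "COMPLETED";
        }

        @Override
        public void stop() {
            stopRequested.set(true);
        }
    }

    public static void main(String[] args) {
        CountingBatchlet normal = new CountingBatchlet(5);
        System.out.println(normal.process());

        CountingBatchlet cancelled = new CountingBatchlet(5);
        cancelled.stop(); // cancel before any work is done
        System.out.println(cancelled.process());
    }
}
```

The stop flag is an AtomicBoolean because a cancel request would typically arrive on a different thread than the one running process().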
A Job XML for a chunk-oriented processing may look like:

<job id="myJob" xmlns="http://batch.jsr352/jsl">
    <step id="myStep" >
        <chunk reader="MyItemReader"
               commit-interval="10" />
    </step>
</job>

  • The <job> has an "id" attribute that defines the logical name of the job and is used for identification purposes.
  • Each <job> can contain multiple <step>s, where each <step> identifies a job step and its characteristics. Each <step> has an "id" attribute that defines the logical name of the step and is used for identification purposes.
  • A <step> may have a <chunk> or a <batchlet> element; this <step> has a <chunk>. A <chunk> identifies a chunk-type step and implements the reader-processor-writer pattern of batch.
  • The "reader", "processor", and "writer" attributes specify the class names of an item reader, processor, and writer respectively.
  • "buffer-size" specifies number of items to read and buffer before writing. When enough items have been read to fill the buffer, the buffer is emptied to a list and the configured ItemWriter is invoked with the list of items.
  • "checkpoint-policy" attribute specifies the checkpoint policy that governs commit behavior for this chunk. Valid values are "item", "time" and "custom". The "item" policy means chunk is checkpointed after a specified number of items are processed. The "time" policy means the chunk is committed after a specified amount of time. The "custom" policy means the chunk is checkpointed according to a checkpoint algorithm implementation. The default policy is "item".
  • "commit-interval" specifies the commit interval for the specified checkpoint policy. Its unit depends on that policy: for the "item" policy, commit-interval is a number of items; for the "time" policy, it is a number of seconds. The commit-interval attribute is ignored for the "custom" policy.

    When the configured checkpoint policy directs that it is time to checkpoint, all the items read and processed so far are passed to the "writer".
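The way "checkpoint-policy" and "commit-interval" work together can be sketched as a small decision function. This is only an illustration of the rules described above: the class CheckpointPolicySketch, the method shouldCheckpoint, and its parameters are hypothetical names, and a real runtime would track the item count and elapsed time itself.

```java
// Hypothetical helper illustrating the "item" and "time" policies; not the JSR 352 API.
public class CheckpointPolicySketch {

    static boolean shouldCheckpoint(String policy, int commitInterval,
                                    int itemsSinceCheckpoint, long secondsSinceCheckpoint) {
        switch (policy) {
            case "item":
                // commit-interval is a number of items
                return itemsSinceCheckpoint >= commitInterval;
            case "time":
                // commit-interval is a number of seconds
                return secondsSinceCheckpoint >= commitInterval;
            default:
                // "custom" ignores commit-interval and asks a checkpoint algorithm instead
                throw new IllegalArgumentException("not handled in this sketch: " + policy);
        }
    }

    public static void main(String[] args) {
        int sinceCheckpoint = 0;
        for (int item = 1; item <= 25; item++) {
            sinceCheckpoint++;
            if (shouldCheckpoint("item", 10, sinceCheckpoint, 0)) {
                System.out.println("checkpoint after item " + item);
                sinceCheckpoint = 0;
            }
        }
    }
}
```

With the "item" policy and a commit-interval of 10, the loop in main checkpoints after items 10 and 20; the remaining five items would be committed when the step ends.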

Here is a simple reader:

public class MyItemReader {
    private static int id;
    MyCheckPoint checkpoint = null;

    @Open
    void open(MyCheckPoint checkpoint) {
        this.checkpoint = checkpoint;
        System.out.println(getClass().getName() + ".open: " + checkpoint.getItemCount());
    }

    @ReadItem
    MyBatchRecord read() {
        return new MyBatchRecord(++id);
    }

    @CheckpointInfo
    MyCheckPoint getCheckPoint() {
        return checkpoint;
    }
}
Methods marked with @Open, @ReadItem, and @CheckpointInfo are required.

Here is a simple processor that rejects every other item:

public class MyItemProcessor {
    MyBatchRecord process(MyBatchRecord record) {
        // keep records with an even id, reject the rest by returning null
        return (record.getId() % 2 == 0) ? record : null;
    }
}

And here is a simple writer:

public class MyItemWriter {
    MyCheckPoint checkpoint = null;

    void open(MyCheckPoint checkpoint) {
        this.checkpoint = checkpoint;
        System.out.println(getClass().getName() + ".open: " + checkpoint.getItemCount());
    }

    void write(List<MyBatchRecord> list) {
        System.out.println("Writing the chunk...");
        for (MyBatchRecord record : list) {
            System.out.println(record.getId());
        }
        System.out.println("... done.");
    }

    MyCheckPoint getCheckPoint() {
        return checkpoint;
    }
}

Finally, a simple implementation of MyCheckPoint:

public class MyCheckPoint {
    int itemCount;

    public int getItemCount() {
        return itemCount;
    }

    public void setItemCount(int itemCount) {
        this.itemCount = itemCount;
    }

    void incrementByOne() {
        itemCount++;
    }

    void increment(int size) {
        itemCount += size;
    }
}

Together, MyItemReader, MyItemWriter, MyItemProcessor, MyCheckPoint, and batch.xml will read, process, and write 5 items at a time and commit when 10 such items have been processed.

The JSR 352 specification defines several other concepts, such as how Job XML can define sequencing of jobs, listeners to interpose on job execution, transaction management, and running jobs in partitioned and concurrent modes. A subsequent blog will explain some of those concepts.

A complete replay of Java Batch for Cost-Optimized Business Efficiency from JavaOne 2012 can be seen here (click on CON4105_mp4_4105_001 in Media).

Each feature will be added to the JSR subject to EG approval. You can share your feedback to

The APIs and implementation of JSR 352 are not integrated in GlassFish 4 promoted builds yet.



Arun Gupta is a technology enthusiast, a passionate runner, author, and a community guy who works for Oracle Corp.
