Thanks to Sachin Thatte for contributing this article!
Oracle recently introduced the latest release of Oracle Data Integrator (ODI) Enterprise Edition 11g in the summer of 2010. With this offering, Oracle raised the bar on performance, scalability and high availability for data movement and transformation solutions. With its unique E-LT (Extract, Load and Transform) approach, the total cost of ownership for ODI is a fraction of that of its competition.
In this article we will show how to take advantage of ODI's E-LT data movement and transformation capabilities in a highly scalable and highly available way.
ODI performs the execution and orchestration of the E-LT jobs via the lightweight ODI runtime Agent. The ODI Agent can be deployed in a standalone environment as well as on the industry-leading WebLogic Server. It is with the WebLogic Server deployment that ODI achieves scalability and high availability. For deployment on WebLogic Server, ODI ships with a JEE application and a domain configuration deployment template to assist in configuring the agent. ODI also supports using an Oracle RAC database for storing ODI Master and Work Repository data.
The following is a depiction of ODI agent deployed in a cluster, connected to Oracle RAC repository.
When the ODI Agent is deployed on a clustered WebLogic Server, the incoming requests can be served by a farm of machines, each able to take on a slice of the load. The Proxy or Load Balancer that receives all the incoming requests can intelligently distribute them across the managed servers to maximize capacity. Depending upon your business needs, the number of managed servers on which ODI Agents are deployed can be increased or decreased without disrupting your business.
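To make the idea of spreading requests over managed servers concrete, here is a minimal sketch of plain round-robin distribution. It is only an illustration: the server names and session names are hypothetical, and a real Proxy or Load Balancer may use a different policy than simple rotation.

```python
from itertools import cycle

# Hypothetical managed server names; not taken from any real ODI domain.
MANAGED_SERVERS = ["odi_server1", "odi_server2", "odi_server3"]


def round_robin_dispatch(requests, servers):
    """Assign each request to the next server in turn (plain round robin)."""
    if not servers:
        raise ValueError("at least one managed server is required")
    rotation = cycle(servers)
    return [(request, next(rotation)) for request in requests]


# Five hypothetical incoming requests spread over three servers.
incoming = ["session-%d" % i for i in range(1, 6)]
assignments = round_robin_dispatch(incoming, MANAGED_SERVERS)
for request, server in assignments:
    print(request, "->", server)
```

Adding or removing an entry in the server list changes how the requests are spread without touching the callers, which is the property the clustered deployment relies on.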
It is recommended that the ODI Repositories (Master and Work) be deployed on Oracle RAC for load balancing and high availability on the database side. Using Oracle RAC allows ODI to retry failed connections if one of the RAC nodes goes down, and to continue the execution of running ODI Sessions without failing the E-LT tasks.
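The retry behavior described above can be pictured with a small, generic sketch. This is not ODI's implementation: the exception class, the `retries` and `delay_seconds` parameters, and the simulated repository call are all hypothetical names chosen for illustration.

```python
import time


class ConnectionFailed(Exception):
    """Stand-in for a dropped repository connection (illustrative only)."""


def run_with_retry(operation, retries=3, delay_seconds=0.0):
    """Call operation(); on ConnectionFailed, retry up to `retries` more times."""
    attempt = 0
    while True:
        try:
            return operation()
        except ConnectionFailed:
            if attempt >= retries:
                raise  # out of retries: let the failure surface
            attempt += 1
            time.sleep(delay_seconds)


# Simulated repository call: fails twice, as if a RAC node went down,
# then succeeds once another node answers.
calls = {"count": 0}


def flaky_repository_call():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ConnectionFailed("node unavailable")
    return "session updated"


result = run_with_retry(flaky_repository_call, retries=3)
print(result, "after", calls["count"], "attempts")
```

The point of the sketch is that the caller sees a single successful result; the transient failures are absorbed by the retry loop rather than failing the task.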
The ODI Agent includes an internal scheduling service. This scheduling service executes the E-LT jobs automatically based on a schedule that a user defines. The schedule for a job can be one-time or recurring. There are various fault handling options to deal with unforeseen errors caused by environmental problems. The scheduling service runs as a singleton when deployed on a WebLogic cluster. If for any reason the managed server where the scheduling service is deployed goes down or is brought down for maintenance, the scheduling service automatically migrates to one of the available managed servers in the cluster. Thus it provides a fail-safe and highly available scheduling service for executing ODI schedules.
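The singleton migration can be sketched as a simple selection rule: keep the scheduler where it is while that server is up, otherwise move it to one of the remaining servers. In a real deployment the cluster handles this for you; the function and server names below are hypothetical and only illustrate the behavior.

```python
def choose_scheduler_host(servers_up, current_host=None):
    """Return the server that should host the singleton scheduler.

    The current host is kept while it is still up; otherwise the first
    available server (in sorted order) takes over. Returns None when no
    server is available.
    """
    if current_host in servers_up:
        return current_host
    if not servers_up:
        return None
    return sorted(servers_up)[0]


# Hypothetical cluster with two managed servers.
up = {"odi_server1", "odi_server2"}
host = choose_scheduler_host(up)
print("scheduler runs on", host)

# The hosting server goes down for maintenance; the scheduler migrates.
up.discard(host)
new_host = choose_scheduler_host(up, current_host=host)
print("scheduler migrated to", new_host)
```

Exactly one host is returned at any time, which is what keeps schedules from being executed twice.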
How do I set up ODI for HA?
- Configure a named ODI Agent in ODI Studio
ODI Agents are declared and defined in the Topology panel of ODI Studio. Define the ODI Agent that will be used in the WebLogic cluster for high availability and scalable deployment. The host/port defined in the Agent configuration must match the Load Balancer host/port address. All ODI Agent requests will be routed through this host/port address.
- Generate ODI Agent dynamic template for deployment
Using ODI Studio, generate a deployment template for ODI Agent. When generating the deployment template, you can choose the Data Servers that should get deployed as JEE Data Sources so that they are managed and pooled via WebLogic configuration.
- Deploy ODI Agent using the dynamically generated Agent deployment template
Use the WebLogic Configuration Wizard to deploy the template generated in the previous step to a WebLogic cluster. This will allow you to create the set of managed servers that are part of the WebLogic cluster to which the ODI Agent should be deployed. You can also deploy the ODI Agent template on an existing WebLogic Domain.
- Configure Coherence cache properties in JAVA_OPTIONS for managed server startup.
- The tangosol.coherence.localport configuration parameter defines the port that a node uses for the Coherence cluster. Other agent nodes ping this port to detect whether the Coherence cluster exists, and it is also used for other Coherence communication.
All the ODI Agents deployed on a cluster must be connected to the same Coherence cluster. This enables the agents to share knowledge of the tasks performed by each of them, and allows the Scheduling Service to migrate when needed. The following properties are provided to configure the Coherence listen addresses.
oracle.odi.coherence.wkaN : The host name of a Managed Server
oracle.odi.coherence.wkaN.port : Coherence Unicast port configured on that Managed Server
Where N = 1..10
For example, the local port on each node:
Node 1: -Dtangosol.coherence.localport=8095
Node 2: -Dtangosol.coherence.localport=8096
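As a rough sketch of how these properties combine, the helper below composes a JAVA_OPTIONS string for one managed server from the property names listed above. The function name and the host names odihost1 and odihost2 are hypothetical; only the property names and the ports 8095 and 8096 come from this article, and the exact set of flags your environment needs may differ.

```python
def coherence_java_options(local_port, wka_nodes):
    """Build the Coherence-related JVM flags for one managed server.

    wka_nodes is a list of (host, port) pairs; entry N in the list becomes
    oracle.odi.coherence.wkaN and oracle.odi.coherence.wkaN.port.
    """
    if not 1 <= len(wka_nodes) <= 10:
        raise ValueError("N must be in the range 1..10")
    flags = ["-Dtangosol.coherence.localport=%d" % local_port]
    for n, (host, port) in enumerate(wka_nodes, start=1):
        flags.append("-Doracle.odi.coherence.wka%d=%s" % (n, host))
        flags.append("-Doracle.odi.coherence.wka%d.port=%d" % (n, port))
    return " ".join(flags)


# Hypothetical host names; every node lists the same addresses so that
# all agents join the same Coherence cluster.
WKA = [("odihost1", 8095), ("odihost2", 8096)]

node1_options = coherence_java_options(8095, WKA)
node2_options = coherence_java_options(8096, WKA)
print("Node 1:", node1_options)
print("Node 2:", node2_options)
```

Only the local port differs between the two nodes; the list of addresses is identical on both.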
Such an ODI agent deployment will be highly available and will allow scalability to address the load on the Agent based on your business needs.
There is a lot more to be leveraged from a highly available, cluster-deployed ODI Agent. ODI also supports ODI Master and Work repositories deployed in an Oracle RAC configuration to ensure that repository connectivity is always available. This article introduces this new feature of ODI 11g and gets you quickly up to speed on it. You can read all about High Availability for Oracle Data Integrator in the documentation. You can also read details on Coherence and how to configure it in the Coherence documentation.