Thursday Oct 29, 2015

Dynamic Debug Patches in WebLogic Server 12.2.1


Whether we like it or not, we know that no software is perfect. Bugs happen, in spite of the best efforts of developers. Worse, in many circumstances they show up in unexpected ways, and they can be intermittent and hard to reproduce. In such cases, there is often not enough information even to understand the nature of the problem if the product is not sufficiently instrumented to reveal the underlying causes. Direct access to a customer's production environment is usually not an option. To get a better understanding of the underlying problem, instrumented debug patches are usually created in the hope that running the applications with the debug patches will provide more insight. This can be a trial-and-error method and can take several iterations before hitting upon the actual cause. The folks creating a debug patch (typically Support or Development teams in the software provider organization) and the customers running the application are almost always different groups, often in different companies. Thus, each iteration of creating a debug patch, providing it to the customer, getting it applied in the customer environment and getting the results back can take substantial time. In turn, this can result in delays in problem resolution.

In addition, there can be other significant issues with deploying such debug patches. Applying patches in a Java EE environment requires bouncing servers and domains or at least redeploying applications. In mission critical deployments, it may not be possible to immediately apply patches. Moreover, when a server is bounced, its state is lost. Thus, vital failure data in memory may be lost. Also, an intermittent failure may not show up for a long time after restarting servers, making quick diagnosis difficult.

Dynamic Debug Patches

The WebLogic Server 12.2.1 release introduces a new feature called Dynamic Debug Patches, which aims to simplify the process of capturing diagnostic data for quicker problem resolution. With this feature, debug patches can be dynamically activated without having to restart servers or clusters or redeploy applications in a WebLogic domain. It leverages the JDK's instrumentation feature to hot-swap classes from specified debug patches using run-time WLST commands. With the provided WLST commands (described below), one or more debug patches can be activated within the scope of selected servers, clusters, partitions and applications. Since no server restart or application redeployment is needed, the associated logistical impediments are a non-issue. For one, since the applications and services continue to run, there is less of a barrier to activating these patches in production environments. Also, there is no loss of state. Thus, the instrumented code in newly activated debug patches has a better chance of revealing erroneous transient state and providing meaningful diagnostic information.


Dynamic debug patches are ordinary jar files containing patched classes with additional instrumentation such as debug logging, print statements, etc. Typically, product development or support teams build these patch jars and make them available to system operations teams for their activation in the field. To make them available to the WebLogic Server's dynamic debug patches feature, system administrators need to copy them to a specific directory in a domain. By default, this directory is the debug_patches subdirectory under the domain root. However, it can be changed by reconfiguring the DebugPatchDirectory attribute of the DebugPatchesMBean.

Another requirement is that the servers in the domain be started with the debugpatch instrumentation agent, which is specified as an option in the server's startup command. This option is added automatically by the startup scripts created for WebLogic Server 12.2.1 domains.
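For reference, the agent is passed as a -javaagent JVM option pointing at the debug-patch agent jar shipped with WebLogic Server. The exact jar name and path below are an assumption for illustration only; check the startup scripts generated for your 12.2.1 domain for the authoritative value.

-javaagent:${WL_HOME}/server/lib/debugpatch-agent.jar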


Using the Dynamic Debug Patches Feature

We will illustrate the use of this feature by activating and deactivating debug patches on a simple toy application.

The Application

We will use a minimalist toy web application which computes the factorial value of an input integer and returns it to the browser.

package example;

import javax.servlet.GenericServlet;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebServlet;

import java.io.IOException;

/**
 * A trivial servlet: returns the factorial of the input number.
 */
@WebServlet(value="/factorial", name="factorial-servlet")
public class FactorialServlet extends GenericServlet {

  public void service(ServletRequest request, ServletResponse response)
      throws ServletException, IOException {
    String n = request.getParameter("n");
    System.out.println("FactorialServlet called for input=" + n);
    int result = Factorial.getInstance().factorial(n);
    response.getWriter().print("factorial(" + n + ") = " + result);
  }
}

The servlet delegates to the Factorial singleton to compute the factorial value. As an optimization, the Factorial class maintains a Map of previously computed values which serves as an illustration of retaining stateful information while activating or deactivating dynamic debug patches.

package example;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class Factorial {
  private static final Factorial SINGLETON = new Factorial();
  private Map<String, Integer> map = new ConcurrentHashMap<String, Integer>();

  static Factorial getInstance() {
    return SINGLETON;
  }

  public int factorial(String n) {
    if (n == null) {
      throw new NumberFormatException("Invalid argument: " + n);
    }
    n = n.trim();
    Integer val = map.get(n);
    if (val == null) {
      int i = Integer.parseInt(n);
      if (i < 0) {
        throw new NumberFormatException("Invalid argument: " + n);
      }
      int fact = 1;
      while (i > 0) {
        fact *= i;
        i--;
      }
      val = new Integer(fact);
      map.put(n, val);
    }
    return val;
  }
}

Building and Deploying the Application

To build the factorial.war web application, create FactorialServlet.java and Factorial.java as above in an empty directory. Build the application war file with the following commands:

mkdir -p WEB-INF/classes
javac -cp <path-to-servlet-api-jar> -d WEB-INF/classes FactorialServlet.java Factorial.java
jar cvf factorial.war WEB-INF

Deploy the application using WLST (or WebLogic Server Administration Console):

Initializing WebLogic Scripting Tool (WLST) ...
Welcome to WebLogic Server Administration Scripting Shell
Type help() for help on available commands
connect(username, password, adminUrl)  # e.g. connect('weblogic', 'weblogic', 't3://localhost:7001')
Connecting to t3://localhost:7001 with userid weblogic ...
Successfully connected to Admin Server "myserver" that belongs to domain "mydomain".
Warning: An insecure protocol was used to connect to the server.
To ensure on-the-wire security, the SSL port or Admin port should be used instead.
deploy('factorial', 'factorial.war', targets='myserver')

Note that in the illustration above, we targeted the application only to the administration server. In the real world, it might be targeted to other managed servers or clusters. We will discuss how to activate and deactivate debug patches over multiple managed servers and clusters in a subsequent article.

Invoke the web application from your browser, for example: http://localhost:7001/factorial/factorial?n=4. You should see the result in the browser and a message such as the following in the server's stdout window:

FactorialServlet called for input=4

The Debug Patch

The application as written does not perform a lot of logging and does not reveal much about its functioning. Perhaps there is a problem and we need more information when it executes. We can create a debug patch from the application code and provide it to the system administrator so he/she can activate it on the running server/application. Let us modify the above code to add print statements that provide additional information (i.e., the lines with "MYDEBUG" below).

Updated Factorial.java (version 1)

package example;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class Factorial {
  private static final Factorial SINGLETON = new Factorial();
  private Map<String, Integer> map = new ConcurrentHashMap<String, Integer>();

  static Factorial getInstance() {
    return SINGLETON;
  }

  public int factorial(String n) {
    if (n == null) {
      throw new NumberFormatException("Invalid argument: " + n);
    }
    n = n.trim();
    Integer val = map.get(n);
    if (val == null) {
      int i = Integer.parseInt(n);
      if (i < 0) {
        throw new NumberFormatException("Invalid argument: " + n);
      }
      int fact = 1;
      while (i > 0) {
        fact *= i;
        i--;
      }
      val = new Integer(fact);
      System.out.println("MYDEBUG> saving factorial(" + n + ") = " + val);
      map.put(n, val);
    } else {
      System.out.println("MYDEBUG> returning saved factorial(" + n + ") = " + val);
    }
    return val;
  }
}

Build the debug patch jar. Note that this is a plain jar file; that is, it is not built as an application archive. Also note that we need not compile the entire application (although it would not hurt). The debug patch jar should contain only the classes which have changed (in this case, Factorial.class).

mkdir patch_classes
javac -d patch_classes Factorial.java
jar cvf factorial_debug_01.jar -C patch_classes .

Activating Debug Patches

In most real-world scenarios, creators (developers) and activators (system administrators) of debug patches would be different people. For the purpose of illustration, we will wear multiple hats here. Assuming that we are using the default configuration for the location of the debug patches directory, create the debug_patches directory under the domain directory if it is not already there. Copy the factorial_debug_01.jar debug patch into the debug_patches directory. Connect to the server with WLST as above.

First, let us check which debug patches are available in the domain. This can be done with the listDebugPatches command.

Hint: To see the available diagnostics commands, issue the help('diagnostics') command. To get information on a specific command, issue help(commandName), e.g. help('activateDebugPatch').

wls:/mydomain/serverConfig/> listDebugPatches()
Active Patches:
Available Patches:
    app2.0_patch01.jar
    app2.0_patch02.jar
    factorial_debug_01.jar

factorial_debug_01.jar is the newly created debug patch. app2.0_patch01.jar and app2.0_patch02.jar were created in the past to investigate issues with some other application. The listing above shows no "active" patches since none have been activated so far.

Now, let us activate the debug patch with the activateDebugPatch command.

wls:/mydomain/serverConfig/> tasks=activateDebugPatch('factorial_debug_01.jar', app='factorial', target='myserver')
wls:/mydomain/serverConfig/> print tasks[0].status
wls:/mydomain/serverConfig/> listDebugPatches()
Active Patches:
    factorial_debug_01.jar (app=factorial)
Available Patches:
    app2.0_patch01.jar
    app2.0_patch02.jar
    factorial_debug_01.jar

The command returns an array of tasks which can be used to monitor the progress and status of the activation command. Multiple managed servers and/or clusters can be specified as targets if applicable; corresponding to each applicable target server, there is a task in the returned tasks array. The command can also be used to activate debug patches at the server and middleware level; such patches will typically be created by Oracle Support as needed. The output of the listDebugPatches() command above shows that factorial_debug_01.jar is now activated on the application "factorial".

Now, let us send some requests to the application: http://localhost:7001/factorial/factorial?n=4 and http://localhost:7001/factorial/factorial?n=5

Server output:

FactorialServlet called for input=4
MYDEBUG> returning saved factorial(4) = 24
FactorialServlet called for input=5
MYDEBUG> saving factorial(5) = 120

Notice that for input=4, saved results were returned since the values were computed and saved in the map due to a prior request. Thus, the debug patch was activated without destroying existing state in the application. For input=5, values were not previously computed and saved, thus a different debug message showed up.

Activating Multiple Debug Patches

If needed, multiple patches which potentially overlap can be activated. A patch which is activated later masks the effects of a previously activated patch where they overlap. Say, in the above case, we need more detailed information from the factorial() method as it executes its inner loop. Let us create another debug patch, copy it to the debug_patches directory and activate it.

Updated Factorial.java (version 2)

package example;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class Factorial {
  private static final Factorial SINGLETON = new Factorial();
  private Map<String, Integer> map = new ConcurrentHashMap<String, Integer>();

  static Factorial getInstance() {
    return SINGLETON;
  }

  public int factorial(String n) {
    if (n == null) {
      throw new NumberFormatException("Invalid argument: " + n);
    }
    n = n.trim();
    Integer val = map.get(n);
    if (val == null) {
      int i = Integer.parseInt(n);
      if (i < 0) {
        throw new NumberFormatException("Invalid argument: " + n);
      }
      int fact = 1;
      while (i > 0) {
        System.out.println("MYDEBUG> multiplying by " + i);
        fact *= i;
        i--;
      }
      val = new Integer(fact);
      System.out.println("MYDEBUG> saving factorial(" + n + ") = " + val);
      map.put(n, val);
    } else {
      System.out.println("MYDEBUG> returning saved factorial(" + n + ") = " + val);
    }
    return val;
  }
}

Build factorial_debug_02.jar

javac -d patch_classes Factorial.java
jar cvf factorial_debug_02.jar -C patch_classes .
cp factorial_debug_02.jar $DOMAIN_DIR/debug_patches

Activate factorial_debug_02.jar

wls:/mydomain/serverConfig/> listDebugPatches()
Active Patches:
    factorial_debug_01.jar (app=factorial)
Available Patches:
    app2.0_patch01.jar
    app2.0_patch02.jar
    factorial_debug_01.jar
    factorial_debug_02.jar
wls:/mydomain/serverConfig/> tasks=activateDebugPatch('factorial_debug_02.jar', app='factorial', target='myserver')
wls:/mydomain/serverConfig/> listDebugPatches()
Active Patches:
    factorial_debug_01.jar (app=factorial)
    factorial_debug_02.jar (app=factorial)
Available Patches:
    app2.0_patch01.jar
    app2.0_patch02.jar
    factorial_debug_01.jar
    factorial_debug_02.jar

Now, let us send some requests to the application: http://localhost:7001/factorial/factorial?n=5 and http://localhost:7001/factorial/factorial?n=6

FactorialServlet called for input=5
MYDEBUG> returning saved factorial(5) = 120
FactorialServlet called for input=6
MYDEBUG> multiplying by 6
MYDEBUG> multiplying by 5
MYDEBUG> multiplying by 4
MYDEBUG> multiplying by 3
MYDEBUG> multiplying by 2
MYDEBUG> multiplying by 1
MYDEBUG> saving factorial(6) = 720

We see the additional information printed due to code in factorial_debug_02.jar.

Deactivating Debug Patches

When a debug patch is no longer needed, it can be deactivated with the deactivateDebugPatches command. To get help on it, execute help('deactivateDebugPatches').

wls:/mydomain/serverConfig/> tasks=deactivateDebugPatches('factorial_debug_02.jar', app='factorial', target='myserver')
wls:/mydomain/serverConfig/> listDebugPatches()
Active Patches:
    factorial_debug_01.jar (app=factorial)
Available Patches:
    app2.0_patch01.jar
    app2.0_patch02.jar
    factorial_debug_01.jar
    factorial_debug_02.jar

Now, executing http://localhost:7001/factorial/factorial?n=2 gets us the following output in the server's stdout window:

FactorialServlet called for input=2
MYDEBUG> saving factorial(2) = 2

Note that when we had activated factorial_debug_01.jar and factorial_debug_02.jar in that order, the classes in factorial_debug_02.jar masked those in factorial_debug_01.jar. After deactivating factorial_debug_02.jar, the classes in factorial_debug_01.jar were unmasked and became effective again. A comma-separated list of debug patches may be specified with the deactivateDebugPatches command. To deactivate all active debug patches on the applicable target servers, the deactivateAllDebugPatches() command may be used.

WLST Commands

The following diagnostic WLST commands are provided to interact with the Dynamic Debug Patches feature. As noted above, help(command-name) shows help for that command.

Command                     Description
activateDebugPatch          Activate a debug patch on specified targets.
deactivateAllDebugPatches   De-activate all debug patches from specified targets.
deactivateDebugPatches      De-activate debug patches on specified targets.
listDebugPatches            List activated and available debug patches on specified targets.
listDebugPatchTasks         List debug patch tasks from specified targets.
purgeDebugPatchTasks        Purge debug patch tasks from specified targets.
showDebugPatchInfo          Show details about a debug patch on specified targets.


The Dynamic Debug Patches feature leverages the JDK's hot-swap capability. Hot-swap has a limitation that hot-swapped classes cannot have a different shape than the original classes. This means that the classes which are swapped in cannot add, remove or update constructors, methods, fields, super classes, implemented interfaces, etc. Only changes in method bodies are allowed. It should be noted that debug patches typically only gather additional information and do not attempt to "fix" the problems as such. Minor fixes which would not change the shape of classes may be tried, but that is not the main purpose of this feature. Therefore, we don't expect this to be a big limitation in practice.
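As a quick, hypothetical illustration of that constraint (the class below is not part of the sample application), a debug patch may only change what happens inside existing method bodies:

// Hypothetical class illustrating what a hot-swap-friendly debug patch may change.
class PriceCalculator {

  // Allowed: only the method body differs from the original class;
  // here an extra debug print was added.
  int total(int unitPrice, int quantity) {
    System.out.println("MYDEBUG> total(" + unitPrice + ", " + quantity + ")");
    return unitPrice * quantity;
  }

  // NOT allowed: additions like the following change the class shape,
  // so hot-swap would reject such a patch version:
  //   private int callCount;        // new field
  //   void resetStats() { ... }     // new method
}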

One issue, however, is that in some cases the new debug code may need to maintain some state. For example, perhaps we want to collect some data in a map and only dump it out when a threshold is reached. The JDK limitation regarding shape changes creates problems in that case. The Dynamic Debug Patches feature provides a DebugPatchHelper utility class to help address some of those concerns. We will discuss that in a subsequent article. Please visit us back to read about it.

Using Diagnostic Context for Correlation

The WebLogic Diagnostics Framework (WLDF) and Fusion Middleware Diagnostics Monitoring System (DMS) provide correlation information in diagnostic artifacts such as logs and Java Flight Recorder (JFR).

The correlation information flows along with a Request across threads within and between WebLogic server processes, and can also flow across process boundaries to/from other Oracle products (such as from OTD or to the Database). This correlation information is exposed in the form of unique IDs which can be used to identify and correlate the flow of a specific request through the system. This information can also provide details on the ordering of the flow.

The correlation IDs are described as follows:

  • DiagnosticContextID (DCID) and ExecutionContextID (ECID). This is the unique identifier which identifies the Request flowing through the system. While the name of the ID may be different depending on whether you are using WLDF or DMS, it is the same ID. I will be using the term ECID as that is the name used in the broader set of Oracle products.
  • Relationship ID (RID). This ID is used to describe where in the overall flow (or tree) the Request is currently at. The ID itself is an ordered set of numbers that describes the location of each task in the tree of tasks. The leading number is usually a zero. A leading number of 1 indicates that it has not been possible to track the location of the sub-task within the overall sub-task tree.

These correlation IDs have been around for quite a long time; what is new in 12.2.1 is that WLDF now picks up some capabilities from DMS (even when DMS is not present):

  1) The RelationshipID (RID) feature from DMS is now supported
  2) The ability to handle correlation information coming in over HTTP
  3) The ability to propagate correlation out over HTTP when using the WebLogic HTTP client
  4) The concept of a non-inheritable Context (not covered in this blog, may be the topic of another blog)

For this blog, we will walk through a simple contrived scenario to show how an administrator can make use of this correlation information to quickly find the data available related to a particular Request flow. This diagram shows the basic scenario:

Each arrow in the diagram shows where a Context propagation could occur; however, in our example propagation occurs only where we have solid blue arrows. The reason for this is that in our example we are using a Browser client which does not supply a Context, so for our example case the first place where a Context is created is when MySimpleServlet is called. Note that a Context could propagate into MySimpleServlet if it is called by a client capable of providing the Context (for example, a DMS enabled HTTP client, a 12.2.1+ WebLogic HTTP client, or OTD).

In our contrived applications, we have each level querying the value of the ECID/RID using the DiagnosticContextHelper API, and the servlet reports these values. A real application would not do this; it is just for our example purposes so our servlet can display them.
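A minimal sketch of such a lookup is shown below. The DiagnosticContextHelper class name comes from the description above; the accessor method names are assumptions on my part, so consult the WebLogic 12.2.1 WLDF Javadoc for the exact signatures.

package example;

import weblogic.diagnostics.context.DiagnosticContextHelper;

public class CorrelationReporter {

  // Returns the current ECID and RID so a servlet can display them,
  // mirroring what the sample applications in this walkthrough do.
  public String report() {
    String ecid = DiagnosticContextHelper.getContextId();  // assumed ECID accessor
    String rid  = DiagnosticContextHelper.getRID();        // assumed RID accessor
    return "ECID=" + ecid + ", RID=" + rid;
  }
}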

We also have the EJB hard-coded to throw an Exception if the servlet request was supplied with a query string. The application logs warnings when that is detected, and the warning log messages automatically get the ECID/RID values included in them. The application does not need to do anything special to get them.

The applications used here, as well as some basic instructions, are attached.

First we will show hitting our servlet with a URL that is not expected to fail (http://myhost:7003/MySimpleServlet/MySimpleServlet):

From the screen shot above we can see that all of the application components are reporting the same ECID (f7cf87c6-9ef3-42c8-80fa-e6007c56c21f-0000022f). We can also see that the RID reported by each component is different and shows the relationship between the components:

Next we will show hitting our servlet with a URL that is expected to fail (http://myhost:7003/MySimpleServlet/MySimpleServlet?fail):

We see that the EJB reported that it failed. In our contrived example app, we can see that the ECID for the entire flow where the failure occurred was "f7cf87c6-9ef3-42c8-80fa-e6007c56c21f-00000231". In a real application, the ECID would not be displayed like this; an administrator would most likely first see warnings reported in the various server logs, and see the ECID reported with those warnings. Since we know the ECID in this case, we can "grep" for it to show what those warnings would look like and that they have the ECID/RID reported in them:

Upon seeing that we had a failure, the admin will capture JFR data from all of the servers involved. In a real scenario, the admin may have noticed the warnings in the logs, or perhaps had a Policy/Action (formerly known as Watch/Notification) configured to automatically notify or capture data. For our simple example, a WLST script is included to capture the JFR data.

The assumption is that folks here are familiar with JFR and Java Mission Control (JMC), and that they have installed the WebLogic Plugin for JMC (see the video on installing the plugin).

Since we have an ECID in hand already related to the failure (in a real case this would be from the warnings in the logs), we will pull up the JFR data in JMC and go directly to the "ECIDs" tab in the "WebLogic" tab group. This tab initially shows us an unfiltered view from the AdminServer JFR, which includes all ECIDs present in that JFR recording:

Next we will copy/paste the ECID "f7cf87c6-9ef3-42c8-80fa-e6007c56c21f-00000231" into the "Filter Column" for "ECID":

With only the specific ECID displayed, we can select it and see the JFR events that are present in the JFR recording related to that ECID. We can right-click to add those associated events to the "operative set" feature in JMC. Once in the "operative set", other views in JMC can also be set to show only the operative set; see Using WLDF with Java Flight Recorder for more information.

Here we see screen shots showing the same filtered view for the ejbServer and webappServer JFR data:

In our simple contrived case, the failure we forced was entirely within application code. As a result, the JFR data we see here shows us the overall flow for example purposes, but it is not going to give us more insight into the failure in this specific case itself. In cases where something that is covered by JFR events caused a failure, it is a good way to see what failed and what happened leading up to the failure.

For more related information, see:

Wednesday Oct 28, 2015

Zero Downtime Patching Released!

Patching and Updating WebLogic servers just got a whole lot easier!  The release of Zero Downtime Patching marks a huge step forward in Oracle's commitment both to simplifying the maintenance of WebLogic servers, and to our ability to provide continuous availability.

Zero Downtime Patching allows you to rollout distributed patches to multiple clusters or to your entire domain with a single command. All without causing any service outages or loss of session data for the end-user. It takes what was once a tedious and time-consuming task and replaces it with a consistent, efficient, and resilient automated process.

By automating this process, we're able to drastically reduce the amount of human input required (and thus the opportunity for errors), and we're able to verify the input that is given before making any changes. This has a huge impact on the consistency and reliability of the process, and it also greatly improves its efficiency.

The process is resilient in that it can retry steps when there are errors, it can pause for problem resolution and resume where it left off, or if desired, it can revert the entire environment back to its original state.

As an administrator, you create and verify a patched OracleHome archive with existing and familiar tools, and place the archive on each node that you want to upgrade. Then, a simple command like the one below will handle the rest.

rolloutOracleHome("Cluster1, Cluster2", "/pathTo/patchedOracleHome.jar", "/pathTo/backupOfUnpatchedOracleHome")

The process works by taking advantage of existing clustering technology combined with an Oracle Traffic Director (OTD) load balancer, which allows us to take individual nodes offline one at a time to be updated. We communicate with the load balancer and instruct it to redirect requests to active nodes. We have also created some advanced techniques for preserving active sessions so the end user will never even know the patching is taking place.

We can leverage this same process for updating the Java version used by servers, and even for doing some upgrades to running applications, all without service downtime for the end-user.

There are a lot of exciting aspects to Zero Downtime (ZDT) Patching that we will be discussing here, so check back often!

For more information about Zero Downtime Patching, view the documentation.

WebLogic Server Multitenant Info Sources

In Will Lyons’s blog entry, he introduced the idea of WebLogic Server Multitenant, and there have been a few other blog entries related to WebLogic Server Multitenant since then. Besides these blogs and the product documentation, there are a couple of other things to take a look at:

I just posted a video on YouTube that includes a high-level introduction to WebLogic Server Multitenant. This video is a little bit longer than my other videos, but there are a lot of good things to talk about in WebLogic Server Multitenant.

We also have a datasheet, which includes a fair amount of detail regarding the value and usefulness of WebLogic Server Multitenant.

I’m at OpenWorld this week where we are seeing a lot of interest in these new features. One aspect of value that seems to keep coming up is the value of running on a shared platform. There are cases where every time a new environment is added, it needs to be certified against security requirements or standard operating procedures. By sharing a platform, those certifications only need to be done once for the environment. New applications deployed in pluggable partitions would not need a ground-up certification. This can mean faster roll-out times/faster time to market and reduced costs.

 That’s all for now. Keep your eye on this blog. More info coming soon!

WebLogic Server 12.2.1 Multi-Tenancy Diagnostics Overview


The WebLogic Server 12.2.1 release includes support for multitenancy, which allows multiple tenants to share a single WebLogic domain. Tenants have access to domain partitions, each of which provides an isolated slice of the WebLogic domain's configuration and runtime infrastructure.

This blog provides an overview of the diagnostics and monitoring capabilities available to tenants for applications and resources deployed to their respective partitions.

These features are provided by the WebLogic Server Diagnostic Framework (WLDF) component.

The following topics are discussed in the sections below.

Log and Diagnostic Data

Log and diagnostic data from different sources are made available to the partition administrators. They are broadly classified into the following groups:

  1. Shared data - Log and diagnostic data not directly available to the partition administrators in raw persisted form. It is only available through the WLDF Accessor component.
  2. Partition scoped data - These logs are available to the partition administrators in raw form under the partition file system directory.

Note that the WLDF Data Accessor component provides access to both the shared and partition scoped log and diagnostic data available on a WebLogic Server instance for a partition.

The following shared logs and diagnostic data are available to a partition administrator.

  • Server - Log events from Server and Application components pertaining to the partition, recorded in the Server log file.
  • Domain - Log events pertaining to the partition, collected centrally from all the Server instances in the WebLogic domain into a single log file.
  • DataSource - DataSource log events pertaining to the partition.
  • HarvestedData Archive - Metrics data gathered by the WLDF Harvester from MBeans pertaining to the partition.
  • Instrumentation Events Archive - WLDF Instrumentation events generated by applications deployed to the partition.

The following partition scoped log and diagnostic data is available to a partition administrator.

  • HTTP access.log - The HTTP access.log from the partition virtual target's WebServer.
  • JMSServer - JMS server message life-cycle events for JMS server resources defined within a resource group or resource group template scoped to a partition.
  • SAF Agent - SAF agent message life-cycle events for SAF agent resources defined within a resource group or resource group template scoped to a partition.
  • Connector - Log data generated by Java EE resource adapter modules deployed to a resource group or resource group template within a partition.
  • Servlet Context - Servlet context log data generated by Java EE web application modules deployed to a resource group or resource group template within a partition.

WLDF Accessor

The WLDF Accessor provides the RuntimeMBean interface to retrieve diagnostic data over JMX. It also provides a query capability to fetch only a subset of the data.

Please refer to the documentation on WLDF Data Accessor for WebLogic Server for a detailed description of this functionality.

WLDFPartitionRuntimeMBean (a child of PartitionRuntimeMBean) is the root of the WLDF Runtime MBeans. It provides a getter for the WLDFPartitionAccessRuntimeMBean interface, which is the entry point for the WLDF Accessor functionality scoped to a partition. There is an instance of WLDFDataAccessRuntimeMBean for each log instance available to partitions.

Different logs are referred to by their logical names according to a predefined naming scheme.

The shared logs accessible by logical name are the Server Log, the Domain Log, Harvested Metrics, and Instrumentation Events. The partition scoped logs are the HTTP Access Log, the JMS Server Log, the SAF Agent Log, the Servlet Context Log, and the Connector Log. The exact logical name pattern for each log type is listed in the WLDF Data Accessor documentation referenced above.

Logging Configuration

WebLogic Server MT supports configuring the Level for java.util.logging Loggers used by application components running within a partition. This allows Java EE applications using java.util.logging to configure levels for their respective loggers even though they do not have access to the system-level java.util.logging configuration mechanism. For shared logger instances used by common libraries across partitions, the level configuration is applied to a logger instance when it is doing work on behalf of a particular partition.
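For example, an application deployed to a partition just uses an ordinary JDK logger; the level it runs at can then be adjusted by the partition administrator through the PartitionLogMBean.PlatformLoggerLevels attribute described below. This is only an illustrative sketch; the class and logger name are hypothetical.

package example;

import java.util.logging.Level;
import java.util.logging.Logger;

public class PayrollService {

  // An ordinary java.util.logging logger; its effective level can be raised or
  // lowered per partition via PartitionLogMBean.PlatformLoggerLevels without the
  // application touching the global java.util.logging configuration.
  private static final Logger LOGGER = Logger.getLogger("example.payroll");

  public void run() {
    LOGGER.info("Payroll run started");
    if (LOGGER.isLoggable(Level.FINE)) {
      LOGGER.fine("Detailed diagnostic output, emitted only when FINE is enabled for this partition");
    }
  }
}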

This feature is available if the WebLogic System Administrator has started the server with the -Djava.util.logging.manager=weblogic.logging.WLLogManager command line system property.

If WebLogic Server was started with the custom log manager as described above, the partition administrator can configure logger levels through the PartitionLogMBean.PlatformLoggerLevels attribute; please refer to the sample WLST script in the WebLogic Server MT documentation.

Note that the level configuration specified in the PartitionLogMBean.PlatformLoggerLevels attribute is applicable only to the owning partition. A logger instance with the same name may be used by another partition; each logger's effective level at runtime is defined by the respective partition's PartitionLogMBean.PlatformLoggerLevels configuration.

Server Debug 

For certain troubleshooting scenarios you may need to enable debug output from WebLogic Server subsystems specific to your partition. The server debug output is useful for debugging internal server code when it is doing work on behalf of a partition. This needs to be done carefully, in collaboration with the WebLogic System Administrator and Oracle Support. The WebLogic System Administrator must first enable the ServerDebugMBean.PartitionDebugLoggingEnabled attribute and will advise you to enable certain debug flags. These flags are boolean attributes defined on the ServerDebugMBean configuration interface. The specific debug flags to be enabled for a partition are configured via the PartitionLogMBean.EnabledServerDebugAttributes attribute, which contains an array of String values that are the names of the specific debug outputs to be enabled for the partition. The debug output thus produced is recorded in the server log, from where it can be retrieved via the WLDF Accessor and provided to Oracle Support for further analysis. Note that once the troubleshooting is done, the debug flags should be disabled, as there is a performance overhead incurred when server debug is enabled.

Please refer to the sample WLST script in the WebLogic Server MT documentation on how to enable partition specific server debug.

Diagnostic System Module for Partitions

The Diagnostic System Module provides the Harvester and the Policies and Actions components, which can be defined within a resource group or resource group template deployed to a partition.


The WLDF Harvester provides the capability to poll MBean metric values periodically and archive the data in the harvested data archive for later diagnosis and analysis. All WebLogic Server Runtime MBeans visible to the partition, including the PartitionRuntimeMBean and its child MBeans as well as custom MBeans created by applications deployed to the partition, are allowed for harvesting. The Harvester configuration defines the sampling period, the MBean types and instance specifications, and their respective MBean attributes that need to be collected and persisted.

Note that the archived harvested metrics data is available from the WLDF Accessor component as described earlier.

The following is an example of harvester configuration persisted in the Diagnostic System Resource XML descriptor.


For further details refer to the WLDF Harvester documentation.

Policies and Actions

Policies are rules that are defined in Java Expression Language (EL) for conditions that need to be monitored. WLDF provides a rich set of actions that can be attached to policies that get triggered if the rule condition is satisfied. 

The following types of rule based policies can be defined.

  • Harvester - Based on WebLogic Runtime MBean or Application owned custom MBean metrics.
  • Log events - Log messages in the server and domain logs.
  • Instrumentation Events - Events generated from Java EE application instrumented code using WLDF Instrumentation.

The following snippets show the configuration of the policies using the EL language.

  <rule-expression>wls.partition.query("com.bea:Type=WebAppComponentRuntime,*", "OpenSessionsCurrentCount").stream().anyMatch(x -> x >= 1)</rule-expression>
  <rule-expression>log.severityString == 'Error'</rule-expression>
  <rule-expression>instrumentationEvent.eventType == 'TraceAction'</rule-expression>

The following types of actions are supported for partitions:

  • JMS
  • SMTP
  • JMX
  • REST
  • Diagnostic Image

For further details refer to the Configuring Policies and Actions documentation.

Instrumentation for Partition Applications

WLDF provides a byte code instrumentation mechanism for Java EE applications deployed within a partition scope. The Instrumentation configuration for the application is specified in the META-INF/weblogic-diagnostics.xml descriptor file.  

This feature is available only if the WebLogic System Administrator has enabled server level instrumentation. Also it is not available for applications that share class loaders across partitions.

The following shows an example WLDF Instrumentation descriptor.

    <pointcut>execution( * example.util.MyUtil * (...))</pointcut>
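To make the pointcut concrete, it would match method executions on a utility class such as the hypothetical sketch below; the class is not part of the product and only illustrates what execution( * example.util.MyUtil * (...)) selects.

package example.util;

// Hypothetical application utility class. With the pointcut shown above, WLDF
// Instrumentation would weave the configured actions (for example, a TraceAction)
// around executions of these methods.
public class MyUtil {

  public String normalize(String value) {   // matched: a method of example.util.MyUtil
    return value == null ? "" : value.trim();
  }

  public int parseOrDefault(String value, int defaultValue) {   // also matched
    try {
      return Integer.parseInt(value);
    } catch (NumberFormatException e) {
      return defaultValue;
    }
  }
}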

For further details refer to the WLDF Instrumentation documentation.

Diagnostic Image

The Diagnostic Image is similar to a core dump which captures the state of the different WebLogic Server subsystems in a single image zip file. WLDF supports the capturing of partition specific diagnostic images.

Diagnostic images can be captured in the following ways:

  • From WLST by the partition administrator.
  • As the configured action for a WLDF policy.
  • By invoking the captureImage() operation on the WLDFPartitionImageRuntimeMBean.

Images are output to the logs/diagnostic_images directory in the partition file system.

The image for a partition contains diagnostic data from different sources such as:

  • Connector
  • Instrumentation
  • JDBC
  • JNDI
  • JVM
  • Logging
  • RCM
  • Work Manager
  • JTA

For further details refer to the WLDF documentation.

RCM Runtime Metrics

WebLogic Server 12.2.1 introduces the Resource Consumption Management (RCM) feature. This feature is only available with Oracle JDK 8u40 and above.

To enable RCM, add the following command line switches at server startup:

-XX:+UnlockCommercialFeatures -XX:+ResourceManagement -XX:+UseG1GC
Please note that RCM is not enabled by default in the startup scripts.
The PartitionResourceMetricsRuntimeMBean, which is a child of the PartitionRuntimeMBean, provides a number of useful metrics for monitoring purposes.

It exposes the following metrics, each through a corresponding attribute getter:

  • Whether RCM metrics data is available for this partition.
  • Total CPU time spent, measured in nanoseconds, in the context of the partition.
  • Total allocated memory, in bytes, for the partition. This metric value increases monotonically over time.
  • Number of threads currently assigned to the partition.
  • Total and current number of sockets opened in the context of the partition.
  • Total number of bytes read/written from sockets for the partition.
  • Total and current number of files opened in the context of the partition.
  • Total number of file bytes read/written in the context of the partition.
  • Total and current number of file descriptors opened in the context of the partition.
  • A snapshot of the historical data for retained heap memory usage for the partition. Data is returned as a two-dimensional array for the usage of retained heap scoped to the partition over time; each item in the array contains a tuple of [timestamp (long), retainedHeap (long)] values.
  • A snapshot of the historical data for CPU usage for the partition. The CPU utilization percentage indicates the percentage of CPU utilized by the partition with respect to the CPU available to WebLogic Server. Data is returned as a two-dimensional array for the CPU usage scoped to the partition over time; each item in the array contains a tuple of [timestamp (long), cpuUsage (long)] values.

Please note that the PartitionMBean.RCMHistoricalDataBufferLimit attribute limits the size of the data arrays for Heap and CPU.
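As a small illustration of consuming one of those snapshots, the sketch below walks a two-dimensional history array of [timestamp, value] tuples. How the array is obtained from the MBean is omitted here, and the timestamp is assumed to be epoch milliseconds.

import java.time.Instant;

public class RcmHistoryPrinter {

  // Prints a retained-heap (or CPU usage) history snapshot: each row is a
  // [timestamp, value] tuple as described above.
  static void printHistory(long[][] history) {
    for (long[] sample : history) {
      long timestamp = sample[0];   // assumed to be epoch milliseconds
      long value = sample[1];       // retained heap in bytes, or CPU usage
      System.out.println(Instant.ofEpochMilli(timestamp) + " -> " + value);
    }
  }
}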

Java Flight Recorder

WLDF provides integration with Java Flight Recorder, which enables WebLogic Server events to be included in the JVM recording. WebLogic Server events generated in the context of work being done on behalf of the partition are tagged with the partition-id and partition-name. These events and the flight recording data are only available to the WebLogic System Administrator.


WLDF provides a rich set of tools to capture and access different types of monitoring data that are very useful for troubleshooting and diagnosis tasks. This blog provided an introduction to the WLDF surface area for partition administrators. You are encouraged to take a deeper dive, explore these features further, and leverage them in your production environments. More detailed information is available in the WLDF documentation for WebLogic Server and the Partition Monitoring section in the WebLogic Server MT documentation.

Partition Import/Export

This article discusses common use case scenarios for partition import/export in WebLogic Server Multitenant.
[Read More]

Tuesday Oct 27, 2015

Resource Consumption Management in WebLogic Server MultiTenant 12.2.1 to Control Resource Usage of Domain Partitions

[This blog post is part of a series of posts that introduce you to new features in the recently announced Oracle WebLogic Server 12.2.1; this post introduces an exciting performance isolation feature that is part of it.]

With the increasing push to "doing more with less" in the enterprise, system administrators and deployers are constantly looking to increase density and improve hardware utilization for their enterprise deployments. The support for micro-containers/pluggable Domain Partitions in WebLogic Server Multitenant helps system administrators collocate their existing siloed, business-critical Java EE deployments into a single Multitenant domain.

Say a system administrator creates two partitions, "Red" and "Blue", in a shared JVM (a WebLogic Multitenant Server instance), and deploys Java EE applications and resources to them. The system administrator would like to avoid the situation where one partition's applications (say, the "Blue" partition's) "hog" all the shared resources in the server instance's JVM (heap) or the operating system (CPU, file descriptors), negatively affecting the "Red" partition applications' access to these resources.

Runtime Isolation

Therefore, while consolidating existing enterprise workloads into a single Multitenant Server instance, system administrators require better control (the ability to track, manage, monitor, and control usage of shared resources by collocated domain partitions) so that:

  • One partition doesn't consume all available resources and exhaust them, starving the other collocated partitions. This helps a system administrator plan for, and support, consistent performance for all collocated partitions.
  • Fair and efficient allocation of available resources is provided to collocated partitions. This helps a system administrator confidently place complementary workloads in the same environment, while achieving enhanced density and great cost savings.

Control Resource Consumption Management


In Fusion Middleware 12.2.1, Oracle WebLogic Server Multitenant supports establishing resource management policies on the following resources:

  • Heap Retained: Track and control the amount of heap retained by a partition.
  • CPU Utilization: Track and control the amount of CPU utilization used by a partition.
  • Open File Descriptors: Track and control the number of open file descriptors (due to file I/O, sockets, etc.) used by a partition.

Recourse Actions

When a trigger is breached, a system administrator may want to react by automatically taking certain recourse actions. The following actions are available out of the box with WebLogic.

  • Notify: inform the administrator that a threshold has been surpassed
  • Slow: reduce the partition’s ability to consume resources, predominantly through manipulation of work manager settings – should cause the system to self-correct in certain situations
  • Fail: reject requests for the resource, i.e. throw an exception – only supported for file descriptors today
  • Stop: as an extreme step, initiate the shutdown sequence for the offending partition on the current server instance


The Resource Consumption Management feature in Oracle WebLogic Server Multitenant enables a system administrator to specify resource consumption management policies on resources, and direct WebLogic to automatically take specific recourse actions when the policies are violated. A policy can be created as one of the following two types:

  • Trigger: This is useful when resource usage by partitions is predictable, and takes the form "when a resource's usage by a partition crosses a threshold, take a recourse action".

    For example, a sample resource consumption policy that a system administrator may establish on the "Blue" partition to ensure that it doesn't run away with all the heap looks like: when the “Retained Heap” (resource) usage for “Blue” (partition) crosses “2 GB” (trigger), “stop” (action) the partition.

  • Fair share: Similar to the Work Manager fair share policy in WebLogic, this policy allows a system administrator to specify "shares" of a bounded-size shared resource to a partition. WebLogic then ensures that this resource is shared effectively (yet fairly) by competing consumers while honouring the "shares" allocated by the system administrator.

    For example, a system administrator who prefers the "Red" partition over "Blue" may set the fair share for the "CPU Utilization" resource in the ratio 60:40 in favour of "Red".

    When complementary workloads are deployed to collocated partitions, fair-share policies also help achieve maximal utilization of resources. For instance, when there are no or limited requests for the "Blue" partition, the "Red" partition would be allowed to "steal" and use all the available CPU time. When traffic resumes on the "Blue" partition and there is contention for CPU, WebLogic would allocate CPU time as per the fair-share ratio set by the system administrator. This helps system administrators reuse a single shared infrastructure, saving infrastructure costs in turn, while still retaining control over how those resources are allocated to partitions.

Policy configurations can be defined at the domain level and reused across multiple pluggable partitions, or they can be defined uniquely for a partition. Policy configurations are flexible enough to support different combinations of trigger-based and fair-share policies for multiple resources to meet your unique business requirements. Policies can also be dynamically reconfigured without requiring a restart of the partition.

The picture below shows how a system administrator could configure two resource consumption management policies (a stricter "trial" policy and a lax "approved" policy) and how they could be assigned to individual domain partitions. Heap and CPU resource usage by the two domain partitions is then governed by the policies associated with each of them.

WLS 12.2.1 RCM resource manager sample schematic

Enabling Resource Management

The Resource Consumption Management feature in WebLogic Server 12.2.1 is built on top of the resource management support in Oracle JDK 8u40. WebLogic RCM requires Oracle JDK 8u40 and the G1 Garbage Collector. In WebLogic Server Multitenant, you would need to pass the following additional JVM arguments to enable Resource Management:

-XX:+UnlockCommercialFeatures -XX:+ResourceManagement -XX:+UseG1GC

Track Resource Consumption

Resource consumption metrics are also available on a per-partition basis, provided through a monitoring MBean, PartitionResourceMetricsRuntimeMBean. Detailed usage metrics on a per-partition basis are available through this monitoring MBean, and system administrators may use these metrics for tracking, sizing, analysis, monitoring, and for configuring business-specific Watch and Harvester WLDF rules.

Resource Consumption Managers in WebLogic Multitenant help provide the runtime isolation and protection needed for applications running in your shared and consolidated environments.

For More Information

This blog post only scratches the surface of the possibilities with the Resource Consumption Management feature. For more details on this feature, how you can configure resource consumption management policies in a consolidated Multitenant domain using the WebLogic Scripting Tool (WLST) and Fusion Middleware Control, and best practices, please refer to the detailed technical document "Resource Consumption Management (RCM) in Oracle WebLogic Server Multitenant (MT) - Flexibility and Control Over Resource Usage in Consolidated Environments".

The WebLogic Multitenant documentation's chapter "Configuring Resource Consumption Management" also has more details on using the feature.

This feature is a result of deep integration between the Oracle JDK and WebLogic Server. If you are attending Oracle OpenWorld 2015 in San Francisco, head over to the session titled "Multitenancy in Java: Innovation in the JDK and Oracle WebLogic Server 12.2.1" [CON8633] (Wednesday, Oct 28, 1:45 p.m. | Moscone South—302) to hear us talk about this feature in more detail.

We are also planning a series of videos on using the feature and we will update this blog entry as they become available.

[Read More]

Data Source System Property Enhancement in WLS 12.2.1

The earlier blog at Setting V$SESSION for a WLS Datasource described using system properties to set driver connection properties, which in turn automatically set values on the Oracle database session. Some comments from readers indicated that there were some limitations to this mechanism.

- There are some values that can’t be set on the command line because they aren’t available until the application server starts. The most obvious value is the process identifier.

- Values set on the command line imply that they are valid for all environments in the server, which is fine for values like the program name but not appropriate for datasource-specific values or the new partition name that is available with the WLS Multi Tenancy feature.

- In a recent application that I was working with, it was desirable to connect to the server hosting the datasource that was connected to the session so that we could run a graceful shutdown. In this case, additional information was needed to generate the URL.

All of these cases are handled with the enhanced system properties feature.

The original feature supported setting driver properties using the value of system properties. The new feature is overloaded on top of the old feature to avoid introducing yet another set of driver properties in the graphical user interfaces and WLST scripts. It is enabled by specifying one or more of the supported variables listed below in the string value. If one or more of these variables is included in the system property, it is substituted with the corresponding value. If a value for a variable is not found, no substitution is performed. If none of these variables are found in the system property, then the value is taken as a system property name.


The supported variables resolve to the following values:

  • The first half (up to the @) of ManagementFactory.getRuntimeMXBean().getName()
  • The second half of ManagementFactory.getRuntimeMXBean().getName()
  • A Java system property
  • A system property
  • The data source name from the JDBC descriptor (it does not contain the partition name)
  • The partition name, or DOMAIN
  • The WebLogic Server server listen port
  • The WebLogic Server server SSL listen port
  • The WebLogic Server server name
  • The WebLogic Server domain name
A sample property is shown in the following example:

<sys-prop-value>WebLogic ${servername} Partition ${partition}</sys-prop-value>

In this example, v$session.program running on myserver is set to “WebLogic myserver Partition DOMAIN”.
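The substitution rules described above can be pictured with the following conceptual sketch (it is not WebLogic source code; the class and method are hypothetical): ${...} variables are substituted when known, unknown variables are left untouched, and a value containing no variables is treated as the name of a system property to look up.

import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SysPropValueResolver {

  private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

  // Resolves a sys-prop-value string against a map of known variable values.
  static String resolve(String value, Map<String, String> variables) {
    Matcher m = VAR.matcher(value);
    if (!m.find()) {
      // No ${...} variables: the whole value is taken as a system property name.
      return System.getProperty(value);
    }
    m.reset();
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
      String replacement = variables.get(m.group(1));
      // Unknown variable: keep the original ${...} text unchanged.
      m.appendReplacement(sb, Matcher.quoteReplacement(
          replacement != null ? replacement : m.group(0)));
    }
    m.appendTail(sb);
    return sb.toString();
  }
}

Called with "WebLogic ${servername} Partition ${partition}" and the values myserver and DOMAIN, this sketch yields "WebLogic myserver Partition DOMAIN", matching the example above.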

The biggest limitation of this feature is the character limit on the associated columns in v$session. If you exceed the limit, connection creation will fail.

Using this enhancement combined with the Oracle v$session values can make this a powerful feature for tracking information about the source of connections.

See the blog announcing Oracle WebLogic Server 12.2.1 for more details on Multitenancy and other new features.

Domain Partitions for Multi-tenancy in WebLogic Server 12.2.1

One of the signature enhancements in WebLogic Server 12.2.1 is support for multi-tenancy. Some of the key additions to WebLogic for multi-tenancy are in the configuration and runtime areas where we have added domain partitions, resource groups, and resource group templates. This post gives you an overview of what these are and how they fit together.

Domain Partition

A domain partition (partition for short) is an administrative and runtime slice of a WebLogic domain. In many ways you can think of a partition as a WebLogic micro-container.

In 12.1.3 and before, you define managed servers and security realms in the domain and you deploy apps and resources to the domain. You target the apps and resources to managed servers and clusters to control where they will run.

And in 12.2.1 you can still do all that.

But you can also create one or more partitions in the domain. Each partition will contain its own apps and resources.

You can target each partition to a managed server cluster where you want the partition’s apps and resources to run. (More about targeting later.)

You can start and shut down each partition independently, even if you have targeted multiple partitions to the same cluster. You can have different partitions use different security realms.

Plus, for each partition you can identify partition administrators who can control that partition. You can begin to see how a partition is a micro-container — a slice of the domain which contains it.

Resource Group and PaaS

Suppose you have a set of Java EE applications and supporting resources that are a human resources solution, and you have another set of apps and their resources that are a finance package. In 12.1.3 you would deploy all the apps and all the resources to the domain. You might target them to different clusters — in fact you would have to do that if you wanted to start up and shut down the two packages independently.

Or, in 12.1.3 you could use separate WebLogic domains — one for the HR apps and one for the finance ones. That would give you more control over targeting and starting and shutting down, but at the cost of running more domains.

With 12.2.1 we introduce the resource group which is simply a collection of related Java EE apps and resources. You can gather the HR apps and resources into one resource group and collect the finance apps and resources into another resource group.

Things get even more interesting because each partition can contain one or more resource groups.

If you were thinking about running two separate domains before — one domain for HR and one for finance — you could instead use one domain containing two partitions — one partition for HR and one for finance. With our simple example the HR partition would have a resource group containing the HR apps and resources, and the finance partition would have a resource group containing the finance apps and resources.

In 12.2.1 you actually target each resource group. If it makes sense, you could target the resource groups in both the HR and finance partitions to the same cluster. And, because you can control the partitions independently, you can start or shut down either one without disturbing the other. The managed servers in the cluster stay up throughout. When you start up or shut down a partition, it is the apps and resources in the partition’s resource groups that are started up or shut down, not the entire server.

People often call this the consolidation use case — you are consolidating multiple domains into one by mapping separate domains to separate partitions in a single consolidated domain. This is also called the platform-as-a-service (PaaS) use case. Each partition is used as a WebLogic “platform” (a micro-container) and you deploy different apps and resources to each.

      Resource Group Template and SaaS

      There is an entirely different way to use partitions and resource groups to solve different sorts of problems, but we need one more concept to do it well.

      Suppose you wanted to offer those HR and finance suites of apps, running in your data center, as services to other enterprises.

      In 12.1.3 you might create a separate domain for each client and deploy the HR and finance apps and resources the same way in each domain. You might use a domain template to simplify your job, but you’d still have the overhead of multiple domains. And one domain template might not work too well if one customer subscribed to your HR service but another subscribed to HR and finance.

      In 12.2.1 terms this sounds like two resource groups — the HR resource group and the finance resource group — but each running once for each of your enterprise customers. Using one partition per client sounds just right, but you would not want to define the two resource groups repetitively and identically in each customer's partition.

      WebLogic 12.2.1 introduces the resource group template for just such a situation.

      You define the HR apps and resources in a resource group template within the WebLogic domain. Do the same for the finance apps and resources in another resource group template. Create a partition for each of your enterprise customers, and as before, in each partition, you create a resource group. 

      But now the resource group does not itself define the apps and resources but instead references the HR resource group template. And, if one customer wants to use both suites you create a second resource group in just that customer's partition, linked to the second resource group template.

      When you start the partition corresponding to one of your customers, WebLogic essentially starts up that customer's copies of the apps and resources as defined in the resource group template. When you start another client’s partition, WebLogic starts that customer's copies of the apps and resources.

      This is the classic software-as-a-service (SAAS) use case. And, if you replace the word “customer” with the word “tenant” in this description you see right away how simply WebLogic 12.2.1 supports multi-tenancy through partitions, resource groups, and resource group templates.

      There are other ways to use resource group templates besides offering packaged apps as a service for sale to others, but this example helps to illustrate the basics of how you can use these new features in WebLogic Server together to solve problems in whole new ways.

      Some Details

      The multi-tenancy support in WebLogic 12.2.1 is very rich. We’ve just scratched the surface here in this posting, and as you might expect there are other related features that make this all work in practice. This posting is not the place to cover all those details, but there are a few things I want to mention briefly.

      Resource Overriding

      About the SAAS use case you might be thinking “But I do not want each tenant’s partition configured exactly the same way as set up in the resource group template. For example, different tenants need to use different databases, so the JDBC connection information has to be different for different partitions.”

      WebLogic 12.2.1 lets you create overrides of the settings for apps and resources that are defined in a resource group template, and you can do so differently for each partition. For common cases (such as JDBC connection information) these overrides expose the key attributes you might need to adjust. For each partition and for each resource within the partition you can set up a separate override.

      We intend these overrides to cover most of the cases, but if you need even more control you can create — again, separately for each partition — a resource deployment plan. If you are familiar with application deployment plans in WebLogic the idea is very much the same, except applied to the non-app resources defined in resource group templates.

      To illustrate, here is what WebLogic does (conceptually, at least) when you start a partition containing resource groups linked to resource group templates:

      • The system reads all the resource settings from the resource group template.
      • If there is a resource deployment plan for the partition, the system applies any adjustments in the plan to the resource settings.
      • Finally, if you have created any overrides for that partition, the system applies them.
      • WebLogic uses the resulting resource settings to create that partition’s copies of the resources defined by the template.


      In this post I’ve mentioned targeting, but WebLogic 12.2.1 lets you set up targeting of partitions and resource groups that ranges from simple to very sophisticated (if you need that). My colleague Joe DiPol has published a separate posting about targeting.

      What Next?

      Here is the WebLogic Server 12.2.1 documentation that describes all the new features, and this part specifically describes the new multi-tenancy features. 

      Remember to check out Joe DiPol's posting about targeting.

      For even more about the release of WebLogic Server, please browse other postings on the WebLogic Server blog.

      Monday Oct 26, 2015

      WLS Data Source Multitenancy

      See the blog announcing Oracle WebLogic Server 12.2.1 for more details on Multitenancy and other new features.

      The largest and most innovative feature in WebLogic Server (WLS) 12.2.1 is Multitenancy.  It is integrated with every component in the application server.  As part of the Multi-tenancy effort one of the key concepts being introduced is the notion of a slice of the domain which is referred to as a Partition or Domain Partition.  A Partition defines applications and resources for a specific Tenant where each Partition's configuration and runtime are isolated from other Partitions in the Domain.  Multi-tenancy is expected to reduce administrative overhead associated with managing multiple domains and application deployments, and to improve the density of these deployments, such that operational and infrastructure costs are reduced.

      The concepts of the WLS MT feature are described in WebLogic Server Multitenant (MT).  The details for MT data sources are in the Configuring JDBC chapter.  This article summarizes the use of data sources in a MT environment and focuses on finding your way around the administration console and Fusion Middleware Control.

      When working without the WLS Multi Tenant feature, a data source (DS) may be defined as a system resource or deployed at the domain level. When using the Multi Tenant feature, a data source may also be defined in the following scopes.

      • Domain
        • DS with global scope
        • Domain-level Resource Group with DS with global scope
        • Domain-level Resource Group Template with DS
      • Partition
        • Partition-level Resource Group with DS
        • Partition-level Resource Group based on Resource Group Template
        • Partition-level JDBC System Resource Override
        • Partition-level Resource Deployment Plan
        • Object deployed at the partition level

      The following summarizes each data source deployment type and the mechanism available to update or override the data source definition.

      • Domain-level System Resource, optionally scoped in RG: No override support; change the DS directly.
      • RGT System Resource: Change the DS directly, or override in the RG derived from the RGT.
      • Partition-level System Resource in RG: No override support; change the DS directly.
      • Partition-level System Resource in a RG based on a RGT: JDBC System Override or Resource Deployment Plan.
      • Application Scoped/Packaged Data Source deployed to domain or partition: Application Deployment Plan.
      • Standalone Data Source Module deployed to domain or partition: Application Deployment Plan.
      • Data Source Definitions (Java EE 6) deployed to domain or partition: No override support.

      Creating a data source that is scoped to a Domain-level RG or in a Partition is similar to creating a domain-level system resource. The only additional step is to specify the scope. In the administration console or Fusion Middleware Control (FMWC), there is a drop-down on the first step of creation that lists the available scopes in which to create a data source. In WLST, it’s necessary to create the data source using createJDBCSystemResource on the owner MBean (the MBean for the domain, RG, or RGT).

      The WLST example at Configuring JDBC Data Sources: WLST Example is very useful in setting up a partitioned domain. It demonstrates creating a virtual target, partition, RGT, RG, and data sources at all levels.

      The remainder of this article focuses on the graphical user interfaces. 

      In the administration console, start by selecting the Data Source summary from the Home page. In this first figure, we see the four data sources that were created by running the WLST script.  One is global and the remaining three have various scopes within the partition. 

      If we click on the “ds-using-template” data source and look at the connection pool properties, we see the original values for the data source based on the template.  The overrides don’t show up at this level.

      Selecting New on the Data Source Summary page and creating a Generic Data Source, we can see the drop-down Scope on the first page.  It changes based on the scopes that are currently available.

      Back on the Home page, we can select Domain Partitions and we see the one “partition1” partition with two resource groups.

      If we click on “partition1” and go to the overrides page, we can see the JDBC override for “ds-in-template”. Note that the URL has now been overridden from using “otrade” to “otrade2”.

      Clicking on the “ds-in-template” link allows for changing the overrides. Note that on this page, we can also see that the user has been overridden to be “scott”.

      You can create a new JDBC System Resource Override by selecting New. The drop-down box for Data Source lists the available resources for creating the override. The administration console currently lists all RGs in the partition. However, the intent is that only RGs derived from a RGT should be allowed to have an override. Non-derived RGs should be updated directly, so it's recommended not to override such groups (this capability may be removed in the future).

      Going back to the Home page, select Data Sources, select a Data Source, select Security, select Credential Mappings, then New, and we can enter new User, Remote User, Remote Password triplets.

      It’s possible to look at lists of data sources at various levels. From the domain level, the monitoring tab on the Data Sources page shows all running data sources in all scopes.

      From the Partition page selecting “partition1”, select Resource Group, select “partition1-rg”, select Services, select JDBC, we see the one data source defined in this scope.

      Partition-scoped deployments are handled in the same way as non-partition-scoped deployments: you start by selecting Deployments from the Home page and then find the ear or war file that you want to deploy. On the first page of the “Install Application Assistant”, you can select the Scope as shown in the following figure.

      Once you finish deploying the ear or war file, you can see the associated modules in the console, with the associated scope by clicking on the associated link. Initially there is no deployment plan.

      Creating an application deployment plan is a bit complex.  It's recommended to use the administration console to create it automatically.  Simply go to the deployed data source, change the configuration, and save the changes.  The console creates the associated deployment plan and fills in the Deployment Plan name.

      If you want to override attributes of a RGT-derived partition datasource that are not one of user, password, or URL, you will need to create a Resource Deployment Plan. There is no automation in the console for doing this. You can massage an Application Deployment Plan to look like a Resource Deployment Plan or create one from scratch using an XML editor. Here is an equivalent Resource Deployment Plan.  To use the resource deployment plan, go to Home, Domain Partitions, click on the Partition link, and type in the pathname in the Resource Deployment Plan Path.

      The capabilities in FMWC are similar to the administrative console but with a different look and feel. FMWC currently does not have security pages; data source security configuration must be done in the administration console.

      If you select JDBC Data Sources from the WebLogic Domain drop-down, you will see something like this.

      Selecting Create brings up a page that includes a Scope drop-down.

      Selecting a resource group name brings up a page to edit the RG.

      Selecting an existing DS brings up a page to edit the DS.

      Selecting a partition name brings up a page to edit the Partition attributes.

      If you are looking for the Partition system resource overrides, select the partition name, select the Domain Partition drop-down, select Administration, then select Resource Overrides.

      The page looks like this.

      This page will list every RG that is derived from a RGT. If there is no existing override, the “Has Overrides” column will not have a check and clicking on “Edit Overrides” will create a new override. If “Has Overrides” has a check, clicking on “Edit Overrides” will update the existing overrides, as in the following figure.

      The focus of this article has been on configuration of data sources in a Multi Tenant environment.  It is the intent that using an application in a Multi Tenant environment should be largely transparent to the application software.  The critical part is to configure the application server objects and containers to be deployed in the partitions and resource groups where they are needed.

      Announcing Oracle WebLogic Server 12.2.1

      Oracle WebLogic Server 12.2.1 (12cR2) is now generally available.  The release was highlighted in Larry Ellison's Oracle OpenWorld keynote Sunday night, and was formally announced today in Inderjeet Singh's Oracle OpenWorld General Session and in a press release.  Oracle WebLogic Server 12.2.1 is available for download on the Oracle Technology Network (OTN) at the Oracle Fusion Middleware Download page, and the Oracle WebLogic Server Download page.   The product documentation is posted along with all documentation for Oracle Fusion Middleware 12.2.1, which has also been made available.   

      Oracle WebLogic Server 12.2.1 is the biggest WebLogic Server product release in many years.    We intend to provide extensive detail about its new features in videos and blogs posted here over the coming weeks.   Consider this note an initial summary of some of the major new feature areas:

      • Multitenancy
      • Continuous Availability
      • Developer Productivity and Portability to the Cloud


      WebLogic Server multitenancy enables users to consolidate applications and reduce cost of ownership, while maintaining application isolation and increasing flexibility and agility.   Using multitenancy, different applications (or tenants) can share the same server JVM (or cluster), and the same WebLogic domain, while maintaining isolation from each other by running in separate domain partitions.  

      A domain partition is an isolated subset of a WebLogic Server domain, and its servers.  Domain partitions act like microcontainers, encapsulating applications and the resources (datasources, JMS servers, etc) they depend on.  Partitions are isolated from each other, so that applications in one partition do not disrupt applications running in other partitions in the same server or domain. An amazing set of features provide these degrees of isolation.   We will elaborate on them here over time.

      Though partitions are isolated, they share many resources - the physical system they run on or the VM, the operating system, the JVM, and WebLogic Server libraries.  Because they share resources, they use fewer resources.   Customers can consolidate applications from separate domains into fewer domains, running in fewer JVMs in fewer physical systems.  Consolidation means fewer entities to manage, reduced resource consumption, and lower cost of ownership.

      Partitions are easy to use.  Applications can be deployed to partitions without changes, and we will be providing tools to enable users to migrate existing applications to partitions.   Partitions can be exported from one domain and imported into other domains, simplifying migration of applications across development, test and production environments, and across private and public clouds. Partitions increase application portability and agility, giving development, DevOps, test and production teams more flexibility in how they develop, release and manage production applications.   

      WebLogic Server's new multitenancy features are integrated with innovations introduced across the Oracle JDK, Oracle Coherence, Oracle Traffic Director, and Oracle Fusion Middleware and are closely aligned with Oracle Database Pluggable Databases.   Over time you will see multitenancy being leveraged more broadly in the Oracle Cloud and Oracle Fusion Middleware. Multitenancy represents a significant new innovation for WebLogic Server and for Oracle.

      Continuous Availability

      Oracle WebLogic Server 12.2.1 introduces new features to minimize planned and unplanned downtime in multi data center configurations.    Many WebLogic Server customers have implemented disaster recovery architectures to provide application availability and business continuity.   WebLogic Server's new continuous availability features will make it easier to create, optimize and manage such configurations, while increasing application availability.  They include the following:

      • Zero-downtime patching provides an orchestration framework that controls the application of patches across clusters to maintain application availability during patching. 
      • Live partition migration enables migration of partitions across clusters within the same domain, effectively moving applications from one cluster to another, again while maintaining application availability during the process. 
      • Oracle Traffic Director provides high performance, high availability load balancing, and enables optimized traffic routing and load balancing during patching and partition migration operations. 
      • Cross-site transaction recovery enables automated transaction recovery across active-active configurations when a site fails. 
      • Oracle Coherence 12.2.1 federated caching allows users to replicate data grid updates across clusters and sites to maintain data grid consistency and availability in multi data center configurations.
      • Oracle Site Guard enables the automation and reliable orchestration of failover and failback operations associated with site failures and site switchovers.

      These features, combined with existing availability features in WebLogic Server and Coherence, give users powerful capabilities to meet requirements for business continuity, making WebLogic Server and Coherence the application infrastructure of choice for highly available enterprise environments.   We intend to augment these capabilities over time for use in Oracle Cloud and Oracle Fusion Middleware.

      Developer Productivity and Portability to the Cloud

      Oracle WebLogic Server 12.2.1 enables developers to be more productive, and enables portability of applications to the Cloud.    To empower the individual developer, Oracle WebLogic Server 12.2.1 supports Java SE 8 and the full Java Enterprise Edition 7 (Java EE 7) platform, including new APIs for developer innovation.   We're providing a new lightweight Quick Installer distribution for developers which can be easily patched for consistency with test and production systems.  We've improved deployment performance, and updated IDE support provided in Oracle Enterprise Pack for Eclipse, Oracle JDeveloper and Oracle NetBeans.   Improvements for WebLogic Server developers are paired with dramatic new innovations for building Oracle Coherence applications using standard Java SE 8 lambdas and streams.

      Team development and DevOps tools complement the features provided above, providing portability between on-premise environments and the Oracle Cloud.   For example, Eclipse- and Maven-based development and build environments can easily be pushed to the Developer Cloud Service in the Oracle Cloud, to enable team application development in a shared, continuous integration environment. Applications developed in this manner can be deployed to traditional WebLogic Server domains, or to WebLogic Server partitions, and soon to Oracle WebLogic Server 12.2.1 running in the Java Cloud Service.   New cloud management and portability features, such as comprehensive REST management APIs, automated elasticity for dynamic clusters, partition management, and continued Docker certification provide new tools for flexible deployment and management control of applications in production both on premise and in the Oracle Cloud.  

      All these and many other features make Oracle WebLogic Server 12.2.1 a compelling release with technical innovation and business value for customers building Java applications.   Please download the product, review the documentation and evaluate it for yourself.  And be sure to check back here for more information from our team.

      Saturday Oct 24, 2015

      Using Orachk for Application Continuity Coverage Analysis

      As I described in the blog Part 2 - 12c Database and WLS - Application continuity, Application Continuity (AC) is a great feature for shielding users from errors with minimal changes to the application and configuration. In the blog Using Orachk to Clean Up Concrete Classes for Application Continuity, I described one of the many uses of the Oracle orachk utility, focusing on checking for Oracle concrete class usage that needs to be removed to run with AC.

      The download page for Orachk is at

      Starting in version, the Oracle concrete class checking has been enhanced to recursively expand .ear, .war, and .rar files in addition to .jar files. You no longer need to explode these archives into a directory for checking. This is a big simplification for customers using Java EE applications. Just specify the root directory for your application when setting the command line option -appjar dirname or the environment variable RAT_AC_JARDIR. The orachk utility will do the analysis.

      This article focuses on a second analysis that can be done using Orachk to see whether your application workload will be covered by AC. There are three values that control the AC checking (called acchk in orachk) for coverage analysis. Two of them are the same as for concrete class checking. The third is different, but the two checks can be combined into a single run. The values can be set either on the command line or via shell environment variables (or a mix of both). They are the following.

      Command Line Argument (each can also be set through a corresponding Shell Environment Variable):

      - -asmhome jarfilename: This must point to a version of asm-all-5.0.3.jar that you download from
      - -javahome JDK8dirname: This must point to the JAVA_HOME directory for a JDK8 installation.
      - -apptrc dirname: To analyze trace files for AC coverage, specify a directory name that contains one or more database server trace files. The trace directory is generally
        This test works with the 12c database server only, since AC was introduced in that release.
      - An optional number-of-days value: When scanning the trace directory for trace files, this optional value limits the analysis to the specified most recent number of days. There may be thousands of files, and this parameter drops files older than the specified number of days.
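
      For example, the coverage analysis might be invoked with a command along these lines (a sketch only; the jar, JDK, and trace directory paths are placeholders, and the same values can instead be supplied through the corresponding shell environment variables):

      $ ./orachk -asmhome /tmp/asm-all-5.0.3.jar -javahome /usr/java/jdk1.8.0_60 -apptrc /u01/app/oracle/diag/rdbms/orcl/orcl1/trace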

      In addition, you need to turn on a specific tracing flag on the database server to see RDBMS-side program interfaces for AC.

      You can turn it on programmatically for a single session using a java.sql.Connection by running something like

      Statement statement = conn.createStatement();
      statement.executeUpdate("alter session set events 'trace [progint_appcont_rdbms]' ");

      More likely, you will want to turn it on for all sessions by running

      alter system set event='trace[progint_appcont_rdbms]' scope = spfile ;

      Understanding the analysis requires some knowledge of how AC is used. First, it’s only available when using a replay driver, e.g., oracle.jdbc.replay.DataSourceImpl. Second, it’s necessary to identify request boundaries to the driver so that operations can be tracked and potentially replayed if necessary. The boundaries are defined by casting the connection to oracle.jdbc.replay.ReplayableConnection and calling beginRequest, which enables replay, and endRequest, which disables replay.
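
      For cases where the application must mark the boundaries itself (items 2 and 3 in the list below), a minimal sketch follows. The class name, the placeholder query, and the assumption that the connection comes from a replay data source such as oracle.jdbc.replay.DataSourceImpl are illustrative only.

      import java.sql.Connection;
      import java.sql.SQLException;
      import java.sql.Statement;

      import oracle.jdbc.replay.ReplayableConnection;

      public class RequestBoundaries {
          // 'conn' is assumed to come from a replay data source such as
          // oracle.jdbc.replay.DataSourceImpl; the query is only a placeholder.
          static void doProtectedWork(Connection conn) throws SQLException {
              ((ReplayableConnection) conn).beginRequest();   // enables replay for this request
              try (Statement stmt = conn.createStatement()) {
                  stmt.executeQuery("select 1 from dual");    // round trips here are protected
              } finally {
                  ((ReplayableConnection) conn).endRequest(); // disables replay at the boundary
              }
          }
      }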

      1. If you are using UCP or WLS, the boundaries are handled automatically when you get and close a connection.

      2. If you aren’t using one of these connection pools that are tightly bound to the Oracle replay driver, you will need to do the calls directly in the application.

      3. If you are using a UCP or WLS pool but you get a connection and hold onto it instead of regularly returning it to the connection pool, you will need to handle the intermediate request boundaries. This is error prone and not recommended.

      4. If you call commit on the connection, replay is disabled by default. If you set the service attribute SESSION_STATE_CONSISTENCY to STATIC mode instead of the default DYNAMIC mode (see the command sketch after this list), then a commit does not disable replay. See the first link above for further discussion of this topic. If you are using the default, you should close the connection immediately after the commit; otherwise, subsequent operations are not covered by replay for the remainder of the request.

      5. It is also possible for the application to explicitly disable replay in the current request by calling disableReplay() on the connection.

      6. There are also some operations that cannot be replayed and calling one will disable replay in the current request.
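
      As a sketch of item 4 above, session state consistency is an attribute of the database service; with the 12c srvctl service commands it can typically be set when the service is created or modified. The names below are placeholders and the exact option spelling should be confirmed against your srvctl version.

      $ srvctl modify service -db <db_name> -service <service_name> -session_state STATIC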

      The following is a summary of the coverage analysis.

      - If a round-trip is made to the database server after replay is enabled and not disabled, it is counted as a protected call.

      - If a round-trip is made to the database server when replay has been disabled or replay is inactive (not in a request, or it is a restricted call, or the disable API was called), it is counted as an unprotected call until the next endRequest or beginRequest.

      - Calls that are ignored for the purpose of replay are ignored in the statistics.

      At the end of processing a trace file, it computes

      (protected * 100) / (protected + unprotected)

      to determine PASS (value >= 75), WARNING (25 <= value < 75), and FAIL (value < 25).
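
      For example, a request with 2 protected calls and 1 unprotected call scores (2 * 100) / (2 + 1) = 66, which falls in the WARNING band; this matches the Coverage(%) = 66 entries in the sample output shown later.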

      Running orachk produces a directory named orachk_<uname>_<date>_<time>. If you want to see all of the details, look for a file named o_coverage_classes*.out under the outfiles subdirectory. It has the information for all of the trace files.

      The program generates an html file that is listed in the program output. It drops output related to trace files that PASS (but they might not be 100%). If you PASS but you don’t get 100%, it’s possible that an operation won’t be replayed.

      The output includes the database service name, the module name (from v$session.program, which can be set on the client side using the connection property oracle.jdbc.v$session.program), and the ACTION and CLIENT_ID (which can be set using setClientInfo with "OCSID.ACTION" and "OCSID.CLIENTID" respectively).
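
      As a hedged illustration, these identifiers can be set from the application roughly as follows; the tag values are placeholders.

      import java.sql.Connection;
      import java.sql.SQLClientInfoException;

      public class ClientInfoTagging {
          // Tag the session so the orachk coverage report shows meaningful
          // ACTION and CLIENT_ID values; the tag strings are placeholders.
          static void tagSession(Connection conn) throws SQLClientInfoException {
              conn.setClientInfo("OCSID.ACTION", "qryOrdTotal");
              conn.setClientInfo("OCSID.CLIENTID", "clthost-1199");
          }
      }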

      The following is an actual table generated by orachk.

      Outage Type



      Coverage checks

      TotalRequest = 25
      PASS = 20
      WARNING = 5
      FAIL = 0


      [WARNING] Trace file name = orcl1_ora_10046.trc Row number = 738
      SERVICE NAME = ( MODULE NAME = (ac_1_bt) ACTION NAME = (qryOrdTotal_SP@alterSess_OrdTot) CLIENT ID = (clthost-1199-Default-3-jdbc000386)
      Coverage(%) = 66 ProtectedCalls = 2 UnProtectedCalls = 1


      [WARNING] Trace file name = orcl1_ora_10046.trc Row number = 31878
      SERVICE NAME = ( MODULE NAME = (ac_1_bt) ACTION NAME = (qryOrder3@qryOrder3) CLIENT ID = (clthost-1199-Default-2-jdbc000183)
      Coverage(%) = 66 ProtectedCalls = 2 UnProtectedCalls = 1


      [WARNING] Trace file name = orcl1_ora_10046.trc Row number = 33240
      SERVICE NAME = ( MODULE NAME = (ac_1_bt) ACTION NAME = (addProduct@getNewProdId) CLIENT ID = (clthost-1199-Default-2-jdbc000183)
      Coverage(%) = 66 ProtectedCalls = 2 UnProtectedCalls = 1


      [WARNING] Trace file name = orcl1_ora_10046.trc Row number = 37963
      SERVICE NAME = ( MODULE NAME = (ac_1_bt) ACTION NAME = (updCustCredLimit@updCustCredLim) CLIENT ID = (clthost-1199-Default-2-jdbc000183-CLOSED)
      Coverage(%) = 66 ProtectedCalls = 2 UnProtectedCalls = 1


      [WARNING] Trace file name = orcl1_ora_32404.trc Row number = 289
      SERVICE NAME = (orcl_pdb1) MODULE NAME = (JDBC Thin Client) ACTION NAME = null CLIENT ID = null
      Coverage(%) = 40 ProtectedCalls = 2 UnProtectedCalls = 3


      Report containing checks that passed: /home/username/orachk/orachk_dbhost_102415_200912/reports/acchk_scorecard_pass.html

      If you are not at 100% for all of your trace files, you need to figure out why. Make sure you return connections to the pool, especially after a commit. To figure out exactly which calls are disabling replay or which operations are done after a commit, you should turn on replay debugging on the driver side. This is done by running with the debug driver (e.g., ojdbc7_g.jar), setting the command line option -Doracle.jdbc.Trace=true, and including the following line in the properties file: oracle.jdbc.internal.replay.level=FINEST.
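
      Put together, a debug run might be launched something like the following sketch; the classpath, main class, and logging configuration file name are placeholders, and it assumes the properties file is supplied through the standard java.util.logging configuration mechanism.

      $ java -cp ojdbc7_g.jar:app.jar -Doracle.jdbc.Trace=true -Djava.util.logging.config.file=jdbc-debug.properties com.example.MainClass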

      The orachk utility can help you get your application up and running using Application Continuity.

      - Get rid of Oracle concrete classes.

      - Analyze the database operations in the application to see if any of them are not protected by replay.

      Tuesday Sep 29, 2015

      Multi Data Source Configuration for Database Outages

      Planned Database Maintenance with WebLogic Multi Data Source (MDS)

      This article discusses how to handle planned maintenance on the database server when it is accessed by WebLogic Multi Data Source (MDS) in a fashion that no service interruption occurs.

      To ensure there is no service interruption, there must be multiple database instances available so the database can be updated in a rolling fashion.  Oracle technologies to accomplish this include a RAC cluster and GoldenGate, or a combination of these products (note that DataGuard cannot be used for planned maintenance without service interruption).  Each database instance is configured as a member generic data source, as described in the product documentation.  This approach assumes that the application returns connections to the pool on a regular basis.

      Process Overview

      1. On mid-tier systems - Shut down all member data sources associated with the RAC instance that will be shut down for maintenance. It's important that not all data sources in each MDS list be shut down, so that connections can be reserved on the other member(s). Wait for data source shutdown to complete. See

      2. At this point, it may be desirable to do some work on the database side to reduce remaining connections not associated with a WLS data source. For the Oracle database server, this might include stopping (or relocating) the application services at the instances that will be shut down for maintenance, stopping the listener, and/or issuing a transactional disconnect for the services on the database instance.  See the Active GridLink description for more information.

      3. Shut down the instance with the immediate option using your preferred tools.

      4. Do the planned maintenance.

      5. Start up the database instance using your preferred tools.

      6. Startup the services when the database instances are ready for application use.

      7. On mid-tier systems - Start the member data sources. See

      Shutting down the data source

      Shutting down the data source involves first suspending the data source and then releasing the associated resources, including the connections. When a member data source in a MDS is marked as suspended, the MDS will not try to get connections from the suspended pool but will go to the next member data source in the MDS to reserve connections. It's important that not all data sources in each MDS list be shut down at the same time. If all members are shut down or fail, then access to the MDS will fail and the application will see failures.

      When you gracefully suspend a data source, which happens as the first step of shut down:

      - the data source is immediately marked as suspended at the beginning of the operation so that no further connections will be created on the data source

      - idle (not reserved) connections are not closed but are marked as disabled.

      - after a timeout period for the suspend operation, all remaining connections in the pool will be marked as suspended and “java.sql.SQLRecoverableException: Connection has been administratively disabled. Try later.” will be thrown for any operations on the connection, indicating that the data source is suspended. These connections remain in the pool and are not closed. We won't know until the data source is resumed if they are good or not. In this case, we know that the database will be shut down and the connections in the pool will not be good if the data source is resumed. Instead, we are doing a data source shutdown which will close all of the disabled connections.

      The timeout period is 60 seconds by default. This can be changed by configuring or dynamically setting Inactive Connection Timeout Seconds to a non-zero value (note that this value is overloaded with another feature when connection leak profiling is enabled). There is no upper limit on the inactive timeout. Note that the processing actually checks for in-use (reserved) resources every tenth of a second, so if the timeout value is set to 2 hours but all connections are returned a second later, the operation completes a second later.

      Note that this operation runs synchronously; there is no asynchronous version of the MBean operation available. It was designed to run in a short amount of time, but testing shows that there is no problem setting it for much longer. It should be possible to use threads in Jython if you want to run multiple operations in one script as opposed to many scripts (a Jython programmer is needed).

      This procedure works for MDS configured with either Load-Balancing or Failover.

      This is what a WLST script looks like to edit the configuration to increase the suspend timeout and then use the runtime MBean to shut down a data source. It would need to be integrated into the existing framework for all WLS servers/data sources.

      java weblogic.WLST

      import sys, socket, os
      hostname = socket.gethostname()

      # 'datasource', 'svr', and the connect() credentials are assumed to be supplied by the surrounding framework
      connect(adminUser, adminPassword, adminUrl)

      # Edit the configuration to set the suspend timeout
      edit()
      startEdit()
      cd('/JDBCSystemResources/' + datasource + '/JDBCResource/' + datasource + '/JDBCConnectionPoolParams/' + datasource )
      cmo.setInactiveConnectionTimeoutSeconds(21600) # set the suspend timeout
      save()
      activate()

      # Shut down the data source using the runtime MBean (assumes we are connected to the server named by 'svr')
      serverRuntime()
      cd('/JDBCServiceRuntime/' + svr + '/JDBCDataSourceRuntimeMBeans/' + datasource )
      cmo.shutdown()



      Note that if MDS is using a database service, you cannot stop or relocate the service before suspending or shutting down the MDS. If you do, MDS may attempt to create a connection to the now missing service and it will react as though the database is down and kill all connections, not allowing for a graceful shutdown. Since MDS suspend ensures that no new connections are created at the associated instance (and the MDS only creates connections on this instance, never another instance even if relocated), stopping the service is not necessary for MDS graceful shutdown. Also, since MDS suspend causes all connections to no longer do any operations, no further progress will be made on any sessions (the transactions won't complete) that remain in the MDS pool.

      There is one known problem with this approach related to XA affinity that is enforced by the MDS algorithms. When an XA branch is created on a RAC instance, all additional branches are created on the same instance. While RAC supports XA across instances, there are some significant limitations that applications run into before the prepare so MDS enforces that all operations be on the same instance. As soon as the graceful suspend operation starts, the member data source is marked as suspended so no further connections are allocated there. If an application using global transactions tries to start another branch on the suspending data source, it will fail to get a connection and the transaction will fail. In this case of an XA transaction spanning multiple WLS servers, the suspend is not graceful. This is not a problem for Emulate 1PC or 2pc, which uses a single connection for all work, and LLR.

      If there is a reason to separate the suspending of the data source, at which point all connections are disabled, from the releasing of the resources, it is possible to run suspend followed by forceShutdown. A forced shutdown must be used to avoid going through the waiting period a second time. This separation is not recommended.

      To get a graceful shutdown of the data source when shutting down the database, the data source must be involved. This process of shutting down the data source followed by shutdown of the database requires coordination between the mid-tier and the database server processing. Processing is simplified by using Active GridLink instead of MDS; see the AGL blog included above.

      When using the Oracle database, it is recommended that an application service be configured for each database so that it can be configured to have HA features. By using an application service, it is possible to start up the database without the data source starting to use it until the administrator is ready to make it available and the application service is explicitly started.

      Thursday Sep 10, 2015

      Active GridLink Configuration for Database Outages

      This article discusses designing and deploying an Active GridLink (AGL) data source to handle database down times with an Oracle Database RAC environment. 

      AGL Configuration for Database Outages

      It is assumed that an Active GridLink data source is configured as described in "Using Active GridLink Data Sources", with the following.

      - FAN enabled.  FAN provides rapid notification about state changes for database services, instances, the databases themselves, and the nodes that form the cluster.  It allows for draining of work during planned maintenance with no errors whatsoever returned to applications.

      - Either auto-ONS or an explicit ONS configuration.

      - A dynamic database service.  Do not connect using the database service or PDB service – these are for administration only and are not supported for FAN.

      - Testing connections.  Depending on the outage, applications may receive stale connections when connections are borrowed before a down event is processed.  This can occur, for example, on a clean instance down when sockets are closed coincident with incoming connection requests. To prevent the application from receiving any errors, connection checks should be enabled at the connection pool.  This requires setting test-connections-on-reserve to true and setting the test-table (the recommended value for Oracle is “SQL ISVALID”).
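
      In the JDBC descriptor these settings correspond to connection pool parameters along the following lines (a fragment only; placement inside the jdbc-data-source module's jdbc-connection-pool-params element is assumed):

      <test-connections-on-reserve>true</test-connections-on-reserve>
      <test-table-name>SQL ISVALID</test-table-name>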

      - Optimize SCAN usage.  As an optimization to force re-ordering of the SCAN IP addresses returned from DNS for a SCAN address, set the URL setting LOAD_BALANCE=TRUE for the ADDRESS_LIST in database driver and later.  (In earlier driver versions, use the connection property oracle.jdbc.thinForceDNSLoadBalancing=true.)

      Planned Outage Operations

      For a planned downtime, the goals are to achieve:

      - Transparent scheduled maintenance: Make the scheduled maintenance process at the database servers transparent to applications.

      - Session Draining: When an instance is brought down for maintenance at the database server, draining ensures that all work using instances at that node completes and that idle sessions are removed. Sessions are drained without impacting in-flight work.

      The goal is to manage scheduled maintenance with no application interruption while maintenance is underway at the database server.  For maintenance purposes (e.g., software and hardware upgrades, repairs, changes, migrations within and across systems), the services used are shut down gracefully one or several at a time without disrupting the operations and availability of the WLS applications. Upon a FAN DOWN event, AGL drains sessions away from the instance(s) targeted for maintenance. It is necessary to stop non-singleton services running on the target database instance (assuming that they are still available on the remaining running instances) or relocate singleton services from the target instance to another instance.  Once the services have drained, the instance is stopped with no errors whatsoever to applications.

       The following is a high level overview of how planned maintenance occurs.

      –Detect “DOWN” event triggered by DBA on instances targeted for maintenance

      –Drain sessions away from that (those) instance(s)

      –Perform scheduled maintenance at the database servers

      –Resume operations on the upgraded node(s)

      Unlike Multi Data Source, where operations need to be coordinated on both the database server and the mid-tier, Active GridLink co-operates with the database so that all of these operations are managed from the database server, simplifying the process.  The following lists the steps that are executed on the database server and the corresponding reactions at the mid-tier.

      Database server: Stop the non-singleton service without '-force', or relocate the singleton service. Omitting the -service option operates on all services on the instance.

      $ srvctl stop service -db <db_name> -service <service_name> -instance <instance_name>

      $ srvctl relocate service -db <db_name> -service <service_name> -oldinst <oldinst> -newinst <newinst>

      Mid-tier reaction: The FAN Planned Down (reason=USER) event for the service informs the connection pool that the service is no longer available for use and connections should be drained.  Idle connections on the stopped service are released immediately.  In-use connections are released when returned (logically closed) by the application.  New connections are reserved on other instance(s) and databases offering the services.  This FAN action drains the sessions from the instance without disrupting the application.

      Database server: Disable the stopped service to ensure it is not automatically started again. Disabling the service is optional. This step is recommended for maintenance actions where the service must not restart automatically until the action has completed.

      $ srvctl disable service -db <db_name> -service <service_name> -instance <instance_name>

      Mid-tier reaction: No new connections are associated with the stopped/disabled service at the mid-tier.

      Database server: Allow sessions to drain.  The amount of time depends on the application.  There may be long-running queries.  Batch programs may not be written to periodically return connections and get new ones. It is recommended that batch be drained in advance of the maintenance.

      Database server: Check for long-running sessions.  Terminate these using a transactional disconnect.  Wait for the sessions to drain.  You can run the query again to check if any sessions remain.

      SQL> select count(*) from ( select 1 from v$session where service_name in upper('<service_name>') union all select 1 from v$transaction where status = 'ACTIVE' )

      SQL> exec dbms_service.disconnect_session('<service_name>', DBMS_SERVICE.POST_TRANSACTION);

      Mid-tier reaction: The connection on the mid-tier will get an error.  If using Application Continuity, it's possible to hide the error from the application by automatically replaying the operations on a new connection on another instance.  Otherwise, the application will get a SQLException.

      Repeat the steps above for all services targeted for planned maintenance.

      Database server: Stop the database instance using the immediate option.

      $ srvctl stop instance -db <db_name> -instance <instance_name> -stopoption immediate

      Mid-tier reaction: No impact on the mid-tier until the database and service are re-started.

      Database server: Optionally disable the instance so that it will not automatically start again during maintenance. This step is for maintenance operations where the services cannot resume during the maintenance.

      $ srvctl disable instance -db <db_name> -instance <instance_name>

      Database server: Perform the scheduled maintenance work: patches, repairs and changes.

      Database server: Enable and start the instance.

      $ srvctl enable instance -db <db_name> -instance <instance_name>
      $ srvctl start instance -db <db_name> -instance <instance_name>

      Database server: Enable and start the service back.  Check that the service is up and running.

      $ srvctl enable service -db <db_name> -service <service_name> -instance <instance_name>
      $ srvctl start service -db <db_name> -service <service_name> -instance <instance_name>

      Mid-tier reaction: The FAN UP event for the service informs the connection pool that a new instance is available for use, allowing sessions to be created on this instance at the next request submission.  Automatic rebalancing of sessions starts.

      The following figure shows the distribution of connections for a service across two RAC instances before and after Planned Downtime.  Notice that the connection workload moves from fifty-fifty across both instances to one hundred percent on one instance and zero on the other.  In other words, RAC_INST_1 can be taken down for maintenance without any impact on the business operation.

      Unplanned Outages

      The configuration is the same for planned and unplanned outages.

      There are several differences when an unplanned outage occurs.

      • A component at the database server may fail, making all services unavailable on the instances running at that node.  There is no stop or disable of the services because they have failed.
      • The FAN unplanned DOWN event (reason=FAILURE) is delivered to the mid-tier.
      • For an unplanned event, all sessions are closed immediately preventing the application from hanging on TCP/IP timeouts.  Existing connections on other instances remain usable, and new connections are opened to these instances as needed.
      • There is no graceful draining of connections.  For those applications using services that are configured to use Application Continuity, active sessions are restored on a surviving instance and recovered by replaying the operations, masking the outage from applications.  If not protected by Application Continuity, any sessions in active communication with the instance will receive a SQLException.

      Tuesday Sep 01, 2015

      Active GridLink URLs

      Active GridLink (AGL) is the data source type that provides connectivity between WebLogic Server and an Oracle Database service, which may include one or more Oracle RAC clusters or DataGuard sites.  As the supported topologies grow to include additional features like Global Data Services (GDS), and as new features are added to Oracle networking and database support, the URL used to access these services has also become more complex. There are lots of examples in the documentation.  This is a short article that summarizes patterns for defining the URL string for use with AGL.

      It should be obvious but let me start by saying AGL only works with the Oracle Thin Driver.

      AGL data sources only support long format JDBC URLs. The supported long format pattern is basically the following (there are lots of additional properties, some of which are described below).
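
      For illustration, a long format URL using a single SCAN address generally has the following shape; the host, port, and service name are placeholders.

      jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=scan-hostname)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=myservice)))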


      If not using SCAN, then the ADDRESS_LIST would have one or more ADDRESS attributes with HOST/PORT pairs. It's recommended to use SCAN if possible and it's recommended to use VIP addresses to avoid TCP/IP hangs.

      Easy Connect (short) format URLs are not supported for AGL data sources. The following is an example of an Easy Connect URL pattern that is not supported for use with AGL data sources:
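
      (The host, port, and service name below are placeholders.)

      jdbc:oracle:thin:@//dbhost:1521/myservice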


      General recommendations for the URL are as follows.

      - Use a single DESCRIPTION.  Avoid a DESCRIPTION_LIST to avoid connection delays.

      - Use one ADDRESS_LIST per RAC cluster or DataGuard database.

      - Put RETRY_COUNT, RETRY_DELAY, CONNECT_TIMEOUT at the DESCRIPTION level so that all ADDRESS_LIST entries use the same value. 

      - RETRY_DELAY specifies the delay, in seconds, between the connection retries.  It is new in the release.

      - RETRY_COUNT is used to specify the number of times an ADDRESS list is traversed before the connection attempt is terminated. The default value is 0.  When using SCAN listeners with FAILOVER = on, setting the RETRY_COUNT parameter to 2 means the three SCAN IP addresses are traversed three times each, such that there are nine connect attempts (3 * 3).

      - CONNECT_TIMEOUT is used to specify the overall time used to complete the Oracle Net connect.  Set CONNECT_TIMEOUT=90 or higher to prevent logon storms.   Through the JDBC driver, CONNECT_TIMEOUT is also used for the TCP/IP connection timeout for each address in the URL.  This second usage is preferred to be shorter and eventually a separate TRANSPORT_CONNECT_TIMEOUT will be introduced.  Do not set the driver property on the datasource because it is overridden by the URL property.

      - The service name should be a configured application service, not a PDB or administration service.

      - Specify LOAD_BALANCE=on per address list to balance the SCAN addresses.
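
      Putting these recommendations together, a URL for a single RAC cluster fronted by a SCAN address might look roughly like the following; the host, port, service name, and timeout values are illustrative placeholders.

      jdbc:oracle:thin:@(DESCRIPTION=(CONNECT_TIMEOUT=90)(RETRY_COUNT=20)(RETRY_DELAY=3)(ADDRESS_LIST=(LOAD_BALANCE=on)(ADDRESS=(PROTOCOL=TCP)(HOST=scan-hostname)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=myservice)))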

