Wednesday Nov 04, 2015

Local Transaction Leak Profiling for WLS 12.2.1 Datasource

This is the third article in this series on profiling enhancements in WLS 12.2.1 (but maybe not the least, since the problem it addresses appears quite often).

A common application error that is difficult to diagnose occurs when an application leaves a local transaction open on a connection and then returns the connection to the pool. This error can manifest as XAException/XAER_PROTO errors, or as unintentional local transaction commits or rollbacks of database updates. Current workarounds that internally commit or roll back the local transaction when a connection is released add significant overhead, only mask errors that would otherwise be surfaced to the application, and still leave the possibility of data inconsistency.

The Oracle JDBC thin driver supports a proprietary method to obtain the local transaction state of a connection. A new profiling option uses it to generate a log entry when a local transaction is detected on a connection as it is released to the connection pool. The log record includes the call stack and details about the thread releasing the connection.

To enable local transaction leak profiling, the datasource connection pool ProfileType attribute bitmask must include the value 0x000200.

This is a WLST script to set the values.

# java weblogic.WLST prof.py
import sys, socket, os
hostname = socket.gethostname()
datasource='ds'
svr='myserver'
connect("weblogic","welcome1","t3://"+hostname+":7001")
# Edit the configuration to enable local transaction leak profiling
edit()
startEdit()
cd('/JDBCSystemResources/' + datasource + '/JDBCResource/' + datasource +
'/JDBCConnectionPoolParams/' + datasource ) 
cmo.setProfileType(0x000200) # turn on local transaction leak profiling
save()
activate()
exit()

Note that you can "or" multiple profile options together when setting the profile type. 

In the Administration Console, this option can be enabled on the datasource Diagnostics tab using the Profile Connection Local Transaction Leak checkbox.

The local transaction leak profile record contains two stack traces, one of the reserving thread and one of the thread at the time the connection was closed. An example log record is shown below.

####<mydatasource> <WEBLOGIC.JDBC.CONN.LOCALTX_LEAK> <Thu Apr 09 15:30:11 EDT 2015> <java.lang.Exception
at weblogic.jdbc.common.internal.ConnectionEnv.setup(ConnectionEnv.java:398)
at weblogic.common.resourcepool.ResourcePoolImpl.reserveResource(ResourcePoolImpl.java:365)
at weblogic.common.resourcepool.ResourcePoolImpl.reserveResource(ResourcePoolImpl.java:331)
at weblogic.jdbc.common.internal.ConnectionPool.reserve(ConnectionPool.java:568)
at weblogic.jdbc.common.internal.ConnectionPool.reserve(ConnectionPool.java:498)
at weblogic.jdbc.common.internal.ConnectionPoolManager.reserve(ConnectionPoolManager.java:135)
at weblogic.jdbc.common.internal.RmiDataSource.getPoolConnection(RmiDataSource.java:522)
at weblogic.jdbc.common.internal.RmiDataSource.getConnectionInternal(RmiDataSource.java:615)
at weblogic.jdbc.common.internal.RmiDataSource.getConnection(RmiDataSource.java:566)
at weblogic.jdbc.common.internal.RmiDataSource.getConnection(RmiDataSource.java:559)
...
> <java.lang.Exception
at weblogic.jdbc.common.internal.ConnectionPool.release(ConnectionPool.java:1064)
at weblogic.jdbc.common.internal.ConnectionPoolManager.release(ConnectionPoolManager.java:189)
at weblogic.jdbc.wrapper.PoolConnection.doClose(PoolConnection.java:249)
at weblogic.jdbc.wrapper.PoolConnection.close(PoolConnection.java:157)
...
> <[partition-id: 0] [partition-name: DOMAIN] >

Looking at the record, you can see where in the application the connection is closed; the fix is to complete the local transaction appropriately before closing the connection.
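As a hedged sketch of that fix (the DAO class, data source, and SQL below are hypothetical), the local transaction is committed or rolled back before the connection is returned to the pool:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

public class OrderDao {
  // Hypothetical update method used only to illustrate completing the local
  // transaction before close(); the SQL and schema are placeholders.
  void updateStatus(DataSource ds, long orderId, String status) throws SQLException {
    Connection conn = ds.getConnection();
    try {
      conn.setAutoCommit(false); // starts a local transaction on this connection
      try (PreparedStatement ps =
               conn.prepareStatement("UPDATE orders SET status = ? WHERE id = ?")) {
        ps.setString(1, status);
        ps.setLong(2, orderId);
        ps.executeUpdate();
      }
      conn.commit();   // complete the local transaction...
    } catch (SQLException e) {
      conn.rollback(); // ...or roll it back on failure
      throw e;
    } finally {
      conn.close();    // only then release the connection to the pool
    }
  }
}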

Closed JDBC Object Profiling for WLS 12.2.1 Datasource

Accessing a closed JDBC object is a common application error that can be difficult to debug. To help diagnose such conditions, there is a new profiling option that generates a diagnostic log message when a JDBC object (Connection, Statement, or ResultSet) is accessed after its close() method has been invoked. The log message includes the stack trace of the thread that invoked the close() method.
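As a minimal, hypothetical sketch of the kind of mistake this option pinpoints (the class and SQL below are made up), a Statement is used after its parent Connection has already been closed:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.sql.DataSource;

public class ReportBean {
  // Hypothetical example only: closing the connection also closes the statement,
  // so the later executeQuery() call fails with an "already closed" error and,
  // with profiling enabled, is logged with the stack trace of the close() call.
  void runReport(DataSource ds) throws Exception {
    Connection conn = ds.getConnection();
    Statement stmt = conn.createStatement();
    conn.close(); // connection (and its statement) closed here
    ResultSet rs = stmt.executeQuery("SELECT 1 FROM DUAL"); // access after close
  }
}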

To enable closed JDBC object profiling, the datasource ProfileType attribute bitmask must have the value 0x000400 set.

This is a WLST script to set the value.

# java weblogic.WLST prof.py
import sys, socket, os
hostname = socket.gethostname()
datasource='ds'
svr='myserver'
connect("weblogic","welcome1","t3://"+hostname+":7001")
# Edit the configuration to enable closed JDBC object profiling
edit()
startEdit()
cd('/JDBCSystemResources/' + datasource + '/JDBCResource/' + datasource +
'/JDBCConnectionPoolParams/' + datasource )
cmo.setProfileType(0x000400) # turn on profiling
save()
activate()
exit()

In the Administration Console, this option can be enabled on the datasource Diagnostics tab using the Profile Closed Usage checkbox.

The closed usage log record contains two stack traces, one of the thread that initially closed the object and another of the thread that attempted to access the closed object. An example record is shown below.

####<mydatasource> <WEBLOGIC.JDBC.CLOSED_USAGE> <Thu Apr 09 15:19:04 EDT 2015> <java.lang.Throwable: Thread[[ACTIVE] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)',5,Pooled Threads]
at weblogic.jdbc.common.internal.ProfileClosedUsage.saveWhereClosed(ProfileClosedUsage.java:31)
at weblogic.jdbc.wrapper.PoolConnection.doClose(PoolConnection.java:242)
at weblogic.jdbc.wrapper.PoolConnection.close(PoolConnection.java:157)
...
> <java.lang.Throwable: Thread[[ACTIVE] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)',5,Pooled Threads]
at weblogic.jdbc.common.internal.ProfileClosedUsage.addClosedUsageProfilingRecord(ProfileClosedUsage.java:38)
at weblogic.jdbc.wrapper.PoolConnection.checkConnection(PoolConnection.java:83)
at weblogic.jdbc.wrapper.Connection.preInvocationHandler(Connection.java:106)
at weblogic.jdbc.wrapper.Connection.createStatement(Connection.java:581)
...
> <[partition-id: 0] [partition-name: DOMAIN] >

When this profiling option is enabled, exceptions indicating that an object is already closed will also include a nested SQLException indicating where the close was done, as shown in the example below.

java.sql.SQLException: Connection has already been closed.
at weblogic.jdbc.wrapper.PoolConnection.checkConnection(PoolConnection.java:82)
at weblogic.jdbc.wrapper.Connection.preInvocationHandler(Connection.java:107)
at weblogic.jdbc.wrapper.Connection.createStatement(Connection.java:582)
at Application.doit(Application.java:156)
...
Caused by: java.sql.SQLException: Where closed: Thread[[ACTIVE] ExecuteThread: ...
at weblogic.jdbc.common.internal.ProfileClosedUsage.saveWhereClosed(ProfileClosedUsage.java:32)
at weblogic.jdbc.wrapper.PoolConnection.doClose(PoolConnection.java:239)
at weblogic.jdbc.wrapper.PoolConnection.close(PoolConnection.java:154)
at Application.doit(Application.java:154)
...

This is very helpful when you get an error indicating that a connection has already been closed and you can't figure out where it was done. Note that there is overhead in capturing the stack trace, so you wouldn't normally run with this option enabled all the time in production (it is not enabled by default), but the overhead is worthwhile when you need to resolve a problem.
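If you want to surface that nested detail in your own error handling, one hedged approach (purely illustrative) is to walk the exception's cause chain and log each entry, which will include the "Where closed" exception when this profiling option is enabled:

import java.sql.SQLException;

public class JdbcErrorLogger {
  // Illustrative helper: prints the exception and each nested cause, which,
  // with closed-usage profiling enabled, includes the "Where closed:" entry.
  static void logCauses(SQLException e) {
    Throwable t = e;
    while (t != null) {
      System.err.println(t);
      t = t.getCause();
    }
  }
}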

Register for Oracle WebLogic Multitenant Webcast

On November 18, 2015 at 10 AM Pacific Time, Oracle will deliver a webcast on Oracle WebLogic Multitenant: The World’s First Cloud-Native, Enterprise Java Platform.  

Although the title focuses on multitenancy, we will cover additional new capabilities in Oracle WebLogic Server and Oracle Coherence 12.2.1.   The webcast will include live chat, demos, and commentary from customers and partners on their planned deployments and benefits, along with product expert deep dives.    

Please register here and take advantage of the opportunity to learn more about using partitions as lightweight microcontainers, consolidating with multitenancy to reduce TCO, leveraging multi-datacenter high availability architectures, and maximizing developer and DevOps productivity for your public and private cloud platforms.  See you in two weeks!

Connection Leak Profiling for WLS 12.2.1 Datasource

This is the first of a series of three articles that describes enhancements to datasource profiling in WLS 12.2.1. These enhancements were requested by customers and Oracle support. I think they will be very useful in tracking down problems in the application.

The pre-12.2.1 connection leak diagnostic profiling option requires that the connection pool “Inactive Connection Timeout Seconds” attribute be set to a positive value in order to determine how long before an idle reserved connection is considered leaked. Once identified as being leaked, a connection is reclaimed and information about the reserving thread is written out to the diagnostics log. For applications that hold connections for long periods of time, false positives can result in application errors that complicate debugging. To address this concern and improve usability, two enhancements to connection leak profiling are available:

1. Connection leak profile records are produced for all reserved connections when the connection pool reaches max capacity and a reserve request results in a PoolLimitSQLException error.

2. An optional Connection Leak Timeout Seconds attribute has been added to the datasource descriptor for use in determining when a connection is considered “leaked”. When an idle connection exceeds the timeout value, a leak profile log message is written and the connection is left intact.

The existing connection leak profiling value (0x000004) must be set on the datasource connection pool ProfileType attribute bitmask to enable connection leak detection. Setting the ProfileConnectionLeakTimeoutSeconds attribute may be used in place of InactiveConnectionTimeoutSeconds for identifying potential connection leaks.

This is a WLST script to set the values.

# java weblogic.WLST prof.py
import sys, socket, os
hostname = socket.gethostname()
datasource='ds'
svr='myserver'
connect("weblogic","welcome1","t3://"+hostname+":7001")
# Edit the configuration to set the leak timeout
edit()
startEdit()
cd('/JDBCSystemResources/' + datasource + '/JDBCResource/' + datasource +
'/JDBCConnectionPoolParams/' + datasource )
cmo.setProfileConnectionLeakTimeoutSeconds(120) # set the connection leak timeout
cmo.setProfileType(0x000004) # turn on profiling
save()
activate()
exit()

This is what the console page looks like after it is set.  Note the profile type and timeout value are set on the Diagnostics tab for the datasource.

The existing leak detection diagnostic profiling log record format is used for leaks triggered either by the ProfileConnectionLeakTimeoutSeconds attribute or when pool capacity is exceeded. In either case, a log record is generated only once for each reserved connection. If a connection is subsequently released to the pool, re-reserved, and leaked again, a new record is generated. An example resource leak diagnostic log record is shown below. The output can be reviewed in the console or by looking at the datasource profile output text file.

####<mydatasource> <WEBLOGIC.JDBC.CONN.LEAK> <Thu Apr 09 14:00:22 EDT 2015> <java.lang.Exception
at weblogic.jdbc.common.internal.ConnectionEnv.setup(ConnectionEnv.java:398)
at weblogic.common.resourcepool.ResourcePoolImpl.reserveResource(ResourcePoolImpl.java:365)
at weblogic.common.resourcepool.ResourcePoolImpl.reserveResource(ResourcePoolImpl.java:331)
at weblogic.jdbc.common.internal.ConnectionPool.reserve(ConnectionPool.java:568)
at weblogic.jdbc.common.internal.ConnectionPool.reserve(ConnectionPool.java:498)
at weblogic.jdbc.common.internal.ConnectionPoolManager.reserve(ConnectionPoolManager.java:135)
at weblogic.jdbc.common.internal.RmiDataSource.getPoolConnection(RmiDataSource.java:522)
at weblogic.jdbc.common.internal.RmiDataSource.getConnectionInternal(RmiDataSource.java:615)
at weblogic.jdbc.common.internal.RmiDataSource.getConnection(RmiDataSource.java:566)
at weblogic.jdbc.common.internal.RmiDataSource.getConnection(RmiDataSource.java:559)
...
> <autoCommit=true,enabled=true,isXA=false,isJTS=false,vendorID=100,connUsed=false,doInit=false,'null',destroyed=false,poolname=mydatasource,appname=null,moduleName=null,
connectTime=960,dirtyIsolationLevel=false,initialIsolationLevel=2,infected=false,lastSuccessfulConnectionUse=1428602415037,secondsToTrustAnIdlePoolConnection=10,
currentUser=...,currentThread=Thread[[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)',5,Pooled Threads],lastUser=null,currentError=null,currentErrorTimestamp=null,JDBC4Runtime=true,supportStatementPoolable=true,needRestoreClientInfo=false,defaultClientInfo={},
supportIsValid=true> <[partition-id: 0] [partition-name: DOMAIN] >

For applications that may have connection leaks but also have some valid long-running operations, you can now scan through a list of connections that may be leaks without interfering with normal application execution.
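For the common case where the leak is simply a missing close(), a hedged fix (the DAO and SQL below are placeholders) is to scope the connection with try-with-resources so it is always returned to the pool:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class CustomerDao {
  // Hypothetical lookup; try-with-resources closes the ResultSet, PreparedStatement
  // and Connection in reverse order even when an exception is thrown.
  String findName(DataSource ds, long id) throws SQLException {
    try (Connection conn = ds.getConnection();
         PreparedStatement ps =
             conn.prepareStatement("SELECT name FROM customers WHERE id = ?")) {
      ps.setLong(1, id);
      try (ResultSet rs = ps.executeQuery()) {
        return rs.next() ? rs.getString(1) : null;
      }
    }
  }
}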

Tuesday Nov 03, 2015

Using Eclipse with WebLogic Server 12.2.1

With the installation of WebLogic Server 12.2.1 now including the Eclipse Network Installer, which enables developers to  download and install Eclipse including the specific features of interest, getting up and running with Eclipse and WebLogic Server has never been easier.

The Eclipse Network Installer presents developers with a guided interface for the custom installation of an Eclipse environment, allowing the selection of an Eclipse version and of the capabilities required - such as Java EE 7, Maven, Coherence, WebLogic, WLST, Cloud and Database tools, amongst others. It then downloads the selected components and installs them directly on the developer's machine.

Eclipse and the Oracle Enterprise Pack for Eclipse plugins continue to provide extensive support for WebLogic Server, enabling it to be used throughout the software lifecycle: from development and test cycles with its Java EE dialogs, assistants and deployment plugins, through to automation of configuration and provisioning of environments with the authoring, debugging and running of scripts using the WLST Script Editor and MBean palette.

The YouTube video WebLogic Server 12.2.1 - Developing with Eclipse provides a short demonstration of how to install Eclipse and the OEPE components using the new Network Installer that is bundled within the WebLogic Server installation. It then shows how to configure a new WebLogic Server 12.2.1 server target within Eclipse, and finishes by importing a Maven project containing a Java EE 7 example application that uses the new Batch API, which is deployed to the server and called from a browser to run.

Monday Nov 02, 2015

Getting Started with the WebLogic Server 12.2.1 Developer Distribution

The new WebLogic Server 12.2.1 release continues down the path of providing an installation that is smaller to download and can be installed with a single operation, giving developers a quicker way to get started with the product.

New with the WebLogic Server 12.2.1 release is the use of the quick installer technology, which packages the product into an executable jar file that silently installs the product into a target directory. Through the use of the quick installer, the installed product can now be patched using the standard Oracle patching utility, opatch, enabling developers to download and apply any patches as needed and also providing a high degree of consistency with downstream testing and production environments.

Despite its smaller distribution size, the developer distribution delivers a full-featured WebLogic Server, including the rich administration console, the comprehensive scripting environment with WLST, the Configuration Wizard and Domain Builders, the Maven plugins and artifacts, and of course all the new WebLogic Server features such as Java EE 7 support, multitenancy, elastic dynamic clusters and more.

For a quick look at using the new developer distribution, creating a domain and accessing the administration console, check out the YouTube video: Getting Started with the Developer Distribution.

JMS 2.0 support in WebLogic Server 12.2.1

As part of its support for Java EE 7, WebLogic Server 12.2.1 supports version 2.0 of the JMS (Java Message Service) specification.

JMS 2.0 is the first update to the JMS specification since version 1.1 was released in 2002. One might think that an API that has remained unchanged for so long has grown moribund and unused. However, if you judge the success of an API standard by the number of different implementations, JMS is one of the most successful APIs around.

In JMS 2.0, the emphasis has been on catching up with the ease-of-use improvements that have been made to other enterprise Java technologies. While technologies such as Enterprise JavaBeans or Java persistence are now much simpler to use than they were a decade ago, JMS had remained unchanged with a successful, but rather verbose, API.

The single biggest change in JMS 2.0 is the introduction of a new simplified API for sending and receiving messages that reduces the amount of code a developer must write. For applications that run in WebLogic server itself, the new API also supports resource injection. This allows WebLogic to take care of the creation and management of JMS objects, simplifying the application even further.
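As a rough sketch of the simplified API (the injected resources and JNDI name below are assumptions, not part of the article), sending a message from a component running in the server can be reduced to a few lines:

import javax.annotation.Resource;
import javax.inject.Inject;
import javax.jms.JMSContext;
import javax.jms.Queue;

public class OrderNotifier {
  // Container-managed JMSContext using the platform default connection factory;
  // the queue JNDI name is hypothetical.
  @Inject
  private JMSContext context;

  @Resource(lookup = "jms/notificationQueue")
  private Queue queue;

  public void notifyShipped(String orderId) {
    // One call replaces the Connection/Session/MessageProducer boilerplate of JMS 1.1.
    context.createProducer().send(queue, "Order shipped: " + orderId);
  }
}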

Other changes in JMS 2.0 include asynchronous send, shared topic subscriptions, and delivery delay. These were existing WebLogic features that are now available through an improved, standard API.

To find out more about JMS 2.0, see this 15 minute audio-visual slide presentation.

Read these two OTN articles:

See also Understanding the Simplified API Programming Model in the product documentation

In a hurry? See Ten ways in which JMS 2.0 means writing less code.

WebLogic Scripting Tool (WLST) updates in 12.2.1

A number of updates have been implemented in Oracle WebLogic Server and Oracle Fusion Middleware 12.2.1 to simplify the usage of the WebLogic Scripting Tool (WLST), especially when multiple Oracle Fusion Middleware products are being used. In his blog, Robert Patrick describes what we have done to unify the usage of WLST across the Oracle Fusion Middleware 12.2.1 product line. This information will be very helpful to WLST users who want to better understand what was implemented in 12.2.1 and any implications for their environments.

ZDT Patching: A Simple Case – Rolling Restart

To get started understanding ZDT Patching, let’s take a look at it in its simplest form, the rolling restart.  In many ways, this simple use case is the foundation for all of the other types of rollouts – Java Version, Oracle Patches, and Application Updates. Executing the rolling restart requires the coordinated and controlled shutdown of all of the managed servers in a domain or cluster while ensuring that service to the end-user is not interrupted, and none of their session data is lost.

The administrator can start a rolling restart by issuing the WLST command below:

rollingRestart("Cluster1")

In this case, the rolling restart will affect all managed servers in the cluster named “Cluster1”. This is called the target. The target can be a single cluster, a list of clusters, or the name of the domain.

When the command is entered, the WebLogic Admin Server will analyze the topology of the target and dynamically create a workflow (also called a rollout), consisting of every step that needs to be taken in order to gracefully shutdown and restart each managed server in the cluster, while ensuring that all sessions on that managed server are available to the other managed servers. The workflow will also ensure that all of the running apps on a managed server are fully ready to accept requests from the end-users before moving on to the next node. The rolling restart is complete once every managed server in the cluster has been restarted.

A diagram illustrating this process on a very simple topology is shown below.  In the diagram you can see that a node is taken offline (shown in red) and end-user requests that would have gone to that node are re-routed to active nodes.  Once the servers on the offline node have been restarted and their applications are again ready to receive requests, that node is added back to the pool of active nodes and the rolling restart moves on to the next node.

Illustration of a Rolling Restart Across a Cluster.

The rolling restart functionality was introduced based on customer feedback.  Some customers have a policy of preemptively restarting their managed servers in order to refresh the memory usage of applications running on top of them. With this feature we are greatly simplifying that tedious and time consuming process, and doing so in a way that doesn’t affect end-users.

For more information about Rolling Restarts with Zero Downtime Patching, view the documentation.

Friday Oct 30, 2015

Elasticity for Dynamic Clusters

Introducing Elasticity for Dynamic Clusters

WebLogic Server 12.1.2 introduced the concept of dynamic clusters, which are clusters where the Managed Server configurations are based on a single, shared template. This greatly simplifies the configuration of clustered Managed Servers, and allows for dynamically assigning servers to machine resources and greater utilization of resources with minimal configuration.

In WebLogic Server 12.2.1, we build on the dynamic clusters concept to introduce elasticity to dynamic clusters, allowing them to be scaled up or down based on conditions identified by the user.  Scaling a cluster can be performed on-demand (interactively by the administrator), at a specific date or time, or based on performance as seen through various server metrics.

In this blog entry, we take a high level look at the different aspects of elastic dynamic clusters in WebLogic 12.2.1.0, the next piece in the puzzle for on-premise elasticity with WebLogic Server!  In subsequent blog entries, we will provide more detailed examinations of the different ways of achieving elasticity with dynamic clusters.

The WebLogic Server Elasticity Framework

The diagram below shows the different parts of the elasticity framework for WebLogic Server:

The Elastic Services Framework is a set of services residing within the Administration Server for a WebLogic domain, and consists of

  • A new set of elastic properties on the DynamicServersMBean for dynamic clusters to establish the elastic boundaries and characteristics of the cluster
  • New capabilities in the WebLogic Diagnostics Framework (WLDF) to allow for the creation of automated elastic policies
  • A new "interceptors" framework to allow administrators to interact with scaling events for provisioning and database capacity checks
  • A set of internal services that perform the scaling
  • (Optional) integration with Oracle Traffic Director (OTD) 12c to notify it of changes in cluster membership and allow it to adapt the workload accordingly

Note that while tighter integration with OTD is possible in 12.2.1, if the OTD server pool is enabled for dynamic discovery, OTD will adapt as necessary to the set of available servers in the cluster.

Configuring Elasticity for Dynamic Clusters

To get started, when you're configuring a new dynamic cluster, or modifying an existing dynamic cluster, you'll want to leverage some new properties surfaced through the DynamicServersMBean for the cluster to set elastic boundaries and control the elastic behavior of the cluster.

The new properties to be configured include

  • The starting dynamic cluster size
  • The minimum and maximum elastic sizes of the cluster
  • The "cool-off" period required between scaling events

There are several other properties regarding how to manage the shutdown of Managed Servers in the cluster, but the above settings control the boundaries of the cluster (by how many instances it can scale up or down), and how frequently scaling events can occur.  The Elastic Services Framework will allow the dynamic cluster to scale up to the specified maximum number of instances, or down to the minimum you allow.  

The cool-off period is a safety mechanism designed to prevent scaling events from occurring too frequently.  It should allow enough time for a scaling event to complete and for its effects to be felt on the dynamic cluster's performance characteristics.

Needless to say, the values for these settings should be chosen carefully and aligned with your cluster capacity planning!

Scaling Dynamic Clusters

Scaling of a dynamic cluster can be achieved through the following means:

  • On-demand through WebLogic Server Administration Console and WLST 
  • Using an automated calendar-based schedule utilizing WLDF policies and actions
  • Through automated WLDF policies based on performance metrics

On-Demand Scaling

WebLogic administrators have the ability to scale a dynamic cluster up or down on demand when needed:

Manual Scaling using the WebLogic Server Administration Console

In the console case, the administrator simply indicates the total number of desired running servers in the cluster, and the Console will interact with the Elastic Services Framework to scale the cluster up or down accordingly, within the boundaries of the dynamic cluster.

Automated Scaling

In addition to scaling a dynamic cluster on demand, WebLogic administrators can configure automated policies using the Policies & Actions feature (known in previous releases as the Watch & Notifications Framework) in WLDF.

Typically, automated scaling will consist of creating pairs of WLDF policies, one for scaling up a cluster, and one for scaling it down.  Each scaling policy consists of 

  • (Optionally) A policy (previously known as a "Watch Rule") expression
  • A schedule
  • A scaling action

To create an automated scaling policy, an administrator must

  • Configure a domain-level diagnostic system module and target it to the Administration Server
  • Configure a scale-up or scale-down action for a dynamic cluster within that WLDF module
  • Configure a policy and assign the scaling action

For more information you can consult the documentation for Configuring Policies and Actions.

Calendar Based Elastic Policies

In 12.2.1, WLDF introduces the ability for cron-style scheduling of policy evaluations.  Policies that monitor MBeans according to a specific schedule are called "scheduled" policies.  

A calendar based policy is a policy that unconditionally executes according to its schedule and executes any associated actions.   When combined with a scaling action, you can create a policy that can scale up or scale down a dynamic cluster at specific scheduled times.

Each scheduled policy type has its own schedule (as opposed to earlier releases, which were tied to a single evaluation frequency), configured in calendar time, allowing the creation of schedule patterns such as (but not limited to):

  • Recurring interval based patterns (e.g., every 5th minute of the hour, or every 30th second of every minute)
  • Days-of-week or days-of-month (e.g., "every Mon/Wed/Fri at 8 AM", or "every 15th and 30th of every month")
  • Specific days and times within a year  (e.g., "December 26th at 8AM EST")

So, for example, an online retailer could configure a pair of policies around the Christmas holidays:

  • A "Black Friday" policy to scale up the necessary cluster(s) to meet increased shopping demand for the Christmas shopping season
  • Another policy to scale down the cluster(s) on December 25th when the Christmas shopping season is over

Performance-based Elastic Policies

In addition to calendar-based scheduling, in 12.2.1 WLDF provides the ability to create scaling policies based on performance conditions within a server ("server-scoped") or cluster ("cluster-scoped").  You can create a policy based on various run-time metrics supported by WebLogic Server.  WLDF also provides a set of pre-packaged, parameterized, out-of-the-box functions called "Smart Rules" to assist in creating performance-based policies.

Cluster-scoped Smart Rules allow you to look at trends in a performance metric across a cluster over a specified window of time and (when combined with scaling actions) scale up or down based on criteria that you specify.  Some examples of the metrics that are exposed through Smart Rules include:

  • Throughput (requests/second)
  • JVM Free heap percentage
  • Process CPU Load
  • Pending user requests
  • Idle threads count
  • Thread pool queue length

Additionally, WLDF provides some "generic" Smart Rules to allow you to create policies based on your own JMX-based metrics.  The full Smart Rule reference can be found here.

And, if a Smart Rule doesn't suit your needs, you can also craft your own policy expressions. In 12.2.1, WLDF utilizes Java EL 3.0 as the policy expression language, allowing you to build expressions based on JavaBean objects and functions (including Smart Rules!) that we provide out of the box.

Provisioning and Safeguards with Elasticity

What if you need to add or remove virtual machines during the scaling process?  In WLS 12.2.1 you can participate in the scaling event utilizing script interceptors.  A script interceptor provides call-out hooks where you can supply custom shell scripts, or other executables, to be called when a scaling event happens on a cluster.  In this manner, you can write a script to interact with 3rd-party virtual machine hypervisors to add virtual machines prior to scaling up, or remove/reassign virtual machines after scaling down. 

WebLogic Server also provides administrators the ability to prevent overloading database capacity on a scale up event through the data source interceptor feature.  Data source interceptors allow you to set a value for the maximum number of connections allowed on a database, by associating a set of data source URLs and URL patterns with a maximum connections constraint.   When a scale up is requested on a cluster, the data source interceptor looks at what the new maximum connection requirements are for the cluster (with the additional server capacity), and if it looks like the scale up could lead to a database overload it rejects the scale up request.  While this still requires adequate capacity planning for your database utilization, it allows you to put in some sanity checks at run time to ensure that your database doesn't get overloaded by a cluster scale up.

Integration with Oracle Traffic Director

The elasticity framework also integrates with OTD through the WebLogic Server 12.2.1 life cycle management services.  When a scaling event occurs, the elasticity framework interacts with the life cycle management services to notify OTD of the scaling event so that OTD can update its routing tables accordingly.

In the event of a scale up event, for example, OTD is notified of the candidate servers and adjusts the server pool accordingly.  

In the case of a scale down, the life cycle management services notify OTD which instances are going away. OTD then stops sending new requests to the servers being scaled down and routes new traffic to the remaining set of instances in the cluster, allowing the instances being removed to be shut down gracefully without losing any requests.

In order for OTD integration to be active, you must enable life cycle management services for the domain as documented here.

The Big Picture - Tying It All Together

The elasticity framework in 12.2.1 provides a lot of power and flexibility to manage the capacity in your on-premise dynamic clusters. As part of your dynamic cluster capacity planning, you can use elasticity to take into account your dynamic cluster's minimum, baseline, and peak capacity needs, and incorporate those settings into the dynamic servers configuration on the cluster. Utilizing WLDF policies and actions, you can create automated policies to scale your cluster at times of known increased or decreased capacity, or to scale up or down based on cluster performance.

Through the use of script interceptors, you can interact with virtual machine pools to add or remove virtual machines during scaling, or perhaps even move shared VMs between clusters based on need.  You can also utilize the data source interceptor to prevent exceeding the capacity of any databases affected by scale up events.

And, when so configured, the Elasticity Framework can interact with OTD during scaling events to ensure that new and in-flight sessions are managed safely when adding or removing capacity in the dynamic cluster.

In future blogs (and maybe vlogs!) we'll go into some of the details on these features. This is really just an overview of the new features that are available to help our users implement elasticity with dynamic clusters. We will follow on in the upcoming weeks and months with more detailed discussions and examples of how to utilize these powerful new features.

In the meantime, you can download a demonstration of policy-based scaling with OTD integration from here, with documentation about how to set it up and run it here.

Feel free to post any questions you have here, or email me directly.  In the meantime, download WebLogic Server 12.2.1 and start poking around! 

Resources

Policy Based Scaling demonstration files and documentation

WebLogic Server 12.2.1 Documentation

Configuring Elasticity for Dynamic Clusters in Oracle WebLogic Server

Configuring WLDF Policies and Actions

Dynamic Clusters Documentation

End-To-End Life Cycle Management and Configuring WebLogic Server MT: The Big Picture

Oracle Traffic Director (OTD) 12c

Java EL 3.0 Specification

Thursday Oct 29, 2015

Oracle WebLogic Server 12.2.1 Continuous Availability

New in Oracle WebLogic Server 12.2.1: Continuous Availability! Continuous Availability is an end-to-end solution for building multi data center architectures. With Continuous Availability, applications running in multi data center environments can run continuously in Active-Active configurations. When one site fails, the other site recovers work for the failed site. During upgrades, applications can still run continuously with zero downtime. What ties it all together is automated data site failover, reducing human error and risk during failover or switchover events.

Reduce Application Downtime

· WebLogic Zero Down Time Patching (ZDT): Automatically orchestrates the rollout of patches and updates while avoiding downtime and session loss. Reduces risk, cost, and session downtime by automating the rollout process. ZDT automatically retries on failure and rolls back if the retry fails. Please read the blog Zero Downtime Patching Released! to learn more about this feature.

· WebLogic Multitenant Live Partition Migration: In multitenant environments, Live Partition Migration is the ability to move running partitions and resource groups from one cluster to another without impacting application users. During upgrade, load balancing, or imminent failure, partitions can be migrated with zero impact to applications.

· Coherence Persistence: Persists cache data and metadata to durable storage. In case of failure of one or more Coherence servers, or the entire cluster, the persisted data and metadata can be recovered.


Replicate State for Multi-Datacenter Deployments

· WebLogic Cross Domain XA Recovery: When a WebLogic Server domain fails in one site, or the entire site comes down, transactions can be automatically recovered in a domain on the surviving site. This allows automated transaction recovery in Active-Active Maximum Availability Architectures.

· Coherence Federated Caching: Distributes Coherence updates across distributed geographical sites with conflict resolution. The modes of replication are Active-Active, with data being continuously replicated and providing applications access to their local cached data; Active-Passive, with the passive site serving as a backup of the active site; and Hub-Spoke, where the hub replicates the cache data to distributed spokes.


Operational Support for Site Failover

· Oracle Traffic Director (OTD): Fast, reliable, and scalable software load balancer that routes traffic to application servers and web servers in the network. Oracle Traffic Director is aware of server availability; when a server is added to the cluster, OTD starts routing traffic to that server. OTD itself can be highly available, in either Active-Active or Active-Passive mode.

· Oracle Site Guard: Provides end-to-end disaster recovery automation. Oracle Site Guard automates failover or switchover by starting and stopping site components in a predetermined order, and by running scripts and post-failover checks. Oracle Site Guard minimizes downtime and human error during failover or switchover.


Continuous Availability provides flexibility by supporting different topologies to meet application needs.

· Active-Active Application Tier with Active-Passive Database Tier

· Active-Passive Application Tier with Active-Passive Database Tier

· Active-Active Stretch Cluster with Active-Passive Database Tier


Continuous Availability provides applications with maximum availability and productivity, data integrity and recovery, local access to data in multi data center environments, real-time access to data updates, automated failover and switchover of sites, and reduced human error and risk during failover/switchover. Protect your applications from downtime with Continuous Availability. If you want to learn more, please read the Continuous Availability documentation or watch the Continuous Availability video.

Dynamic Debug Patches in WebLogic Server 12.2.1

Introduction

Whether we like it or not, we know that no software is perfect. Bugs happen, in spite of the best efforts by developers. Worse, in many circumstances, they show up in unexpected ways. They can also be intermittent and hard to reproduce. In such cases, there is often not enough information even to understand the nature of the problem if the product is not sufficiently instrumented to reveal the underlying causes. Direct access to a customer's production environment is usually not an option. To get a better understanding of the underlying problem, instrumented debug patches are usually created in the hope that running the application with the debug patches will provide more insight. This can be a trial and error process and can take several iterations before hitting upon the actual cause. The folks creating a debug patch (typically Support or Development teams in the software provider organization) and the customers running the application are almost always different groups, often in different companies. Thus, each iteration of creating a debug patch, providing it to the customer, getting it applied in the customer environment, and getting the results back can take substantial time. In turn, this can delay problem resolution.

In addition, there can be other significant issues with deploying such debug patches. Applying patches in a Java EE environment requires bouncing servers and domains or at least redeploying applications. In mission critical deployments, it may not be possible to immediately apply patches. Moreover, when a server is bounced, its state is lost. Thus, vital failure data in memory may be lost. Also, an intermittent failure may not show up for a long time after restarting servers, making quick diagnosis difficult.

Dynamic Debug Patches

In the WebLogic Server 12.2.1 release, a new feature called Dynamic Debug Patches has been introduced, which aims to simplify the process of capturing diagnostic data for quicker problem resolution. With this feature, debug patches can be dynamically activated without having to restart servers or clusters or redeploy applications in a WebLogic domain. It leverages the JDK's instrumentation feature to hot-swap classes from specified debug patches using run-time WLST commands. With the provided WLST commands (as described below), one or more debug patches can be activated within the scope of selected servers, clusters, partitions and applications. Since no server restart or application redeployment is needed, the associated logistical impediments are a non-issue. For one, since the applications and services continue to run, there is less of a barrier to activating these patches in production environments. Also, there is no loss of state. Thus, the instrumented code in newly activated debug patches has a better chance of revealing erroneous transient state and providing meaningful diagnostic information.
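Under the hood this relies on the standard java.lang.instrument API; a minimal, stand-alone sketch of class redefinition (not WebLogic's actual agent, and the agent jar name is hypothetical) looks roughly like this:

import java.lang.instrument.ClassDefinition;
import java.lang.instrument.Instrumentation;

public class HotSwapAgent {
  private static Instrumentation inst;

  // Invoked by the JVM when started with -javaagent:hotswap-agent.jar (name hypothetical);
  // the Instrumentation handle is what allows loaded classes to be redefined later.
  public static void premain(String args, Instrumentation instrumentation) {
    inst = instrumentation;
  }

  // Redefine an already-loaded class with patched bytecode; only method bodies may change.
  static void redefine(Class<?> clazz, byte[] patchedBytes) throws Exception {
    inst.redefineClasses(new ClassDefinition(clazz, patchedBytes));
  }
}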

Prerequisites

Dynamic debug patches are ordinary jar files containing patched classes with additional instrumentation such as debug logging, print statements, etc. Typically, product development or support teams build these patch jars and make them available to system operations teams for their activation in the field. To make them available to the WebLogic Server's dynamic debug patches feature, system administrators need to copy them to a specific directory in a domain. By default, this directory is the debug_patches subdirectory under the domain root. However, it can be changed by reconfiguring the DebugPatchDirectory attribute of the DebugPatchesMBean.

Another requirement is to start the servers in the domain with the debugpatch instrumentation agent with the following option in the server's startup command. It is automatically added by the startup scripts created for WebLogic Server 12.2.1 domains.

-javaagent:${WL_HOME}/server/lib/debugpatch-agent.jar

Using Dynamic Debug Patches Feature

We will illustrate the use of this feature by activating and deactivating debug patches on a simple toy application.

The Application

We will use a minimalist toy web application which computes the factorial value of an input integer and returns it to the browser.

FactorialServlet.java:

package example;

import java.io.IOException;
import javax.servlet.GenericServlet;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.annotation.WebServlet;

import java.util.Map;
import java.util.HashMap;
import java.util.concurrent.ConcurrentHashMap;

/**
 * A trivial servlet: Returns the factorial of the input number.
 */
@WebServlet(value="/factorial", name="factorial-servlet")
public class FactorialServlet extends GenericServlet {

  public void service(ServletRequest request, ServletResponse response)
      throws ServletException, IOException {
    String n = request.getParameter("n");
    System.out.println("FactorialServlet called for input=" + n);
    int result = Factorial.getInstance().factorial(n);
    response.getWriter().print("factorial(" + n + ") = " + result);
  }
}

The servlet delegates to the Factorial singleton to compute the factorial value. As an optimization, the Factorial class maintains a Map of previously computed values which serves as an illustration of retaining stateful information while activating or deactivating dynamic debug patches.

Factorial.java:

package example;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class Factorial {
  private static final Factorial SINGLETON = new Factorial();
  private Map<String, Integer> map = new ConcurrentHashMap<String, Integer>();

  static Factorial getInstance() {
    return SINGLETON;
  }

  public int factorial(String n) {
    if (n == null) {
      throw new NumberFormatException("Invalid argument: " + n);
    }
    n = n.trim();
    Integer val = map.get(n);
    if (val == null) {
      int i = Integer.parseInt(n);
      if (i < 0)
        throw new NumberFormatException("Invalid argument: " + n);
      int fact = 1;
      while (i > 0) {
        fact *= i;
        i--;
      }
      val = new Integer(fact);
      map.put(n, val);
    }
    return val;
  }
}

Building and Deploying the Application

To build the factorial.war web application, create the FactorialServlet.java and Factorial.java files as above in an empty directory. Build the application war file with the following commands:

mkdir -p WEB-INF/classes
javac -d WEB-INF/classes FactorialServlet.java Factorial.java
jar cvf factorial.war WEB-INF

Deploy the application using WLST (or WebLogic Server Administration Console):

$MW_HOME/oracle_common/common/bin/wlst.sh
Initializing WebLogic Scripting Tool (WLST) ...
Welcome to WebLogic Server Administration Scripting Shell
Type help() for help on available commands
connect(username, password, adminUrl)  # e.g. connect('weblogic', 'weblogic', 't3://localhost:7001')
Connecting to t3://localhost:7001 with userid weblogic ...
Successfully connected to Admin Server "myserver" that belongs to domain "mydomain".
Warning: An insecure protocol was used to connect to the server.
To ensure on-the-wire security, the SSL port or Admin port should be used instead.
deploy('factorial', 'factorial.war', targets='myserver')

Note that in the illustration above, we targeted the application only to the administration server. In the real world it could also be targeted to other managed servers or clusters. We will discuss how to activate and deactivate debug patches over multiple managed servers and clusters in a subsequent article.

Invoke the web application from your browser, for example: http://localhost:7001/factorial/factorial?n=4. You should see the result in the browser and a message in the server's stdout window such as:

FactorialServlet called for input=4

The Debug Patch

The application as written does not perform a lot of logging and does not reveal much about its functioning. Perhaps there is a problem and we need more information when it executes. We can create a debug patch from the application code and provide it to the system administrator so he/she can activate it on the running server/application. Let us modify the above code to add print statements for additional information (i.e. the lines with "MYDEBUG" below).

Updated (version 1)  Factorial.java:

class Factorial {
  private static final Factorial SINGLETON = new Factorial();
  private Map<String, Integer> map = new ConcurrentHashMap<String, Integer>();
  static Factorial getInstance() {
    return SINGLETON;
  }
  public int factorial(String n) {
    if (n == null) {
      throw new NumberFormatException("Invalid argument: " + n);
    }
    n = n.trim();
    Integer val = map.get(n);
    if (val == null) {
      int i = Integer.parseInt(n);
      if (i < 0)
        throw new NumberFormatException("Invalid argument: " + n);
      int fact = 1;
      while (i > 0) {
        fact *= i;
        i--;
      }
      val = new Integer(fact);
      System.out.println("MYDEBUG> saving factorial(" + n + ") = " + val);
      map.put(n, val);
    } else {
      System.out.println("MYDEBUG> returning saved factorial(" + n + ") = " + val);
    }
    return val;
  }
}

Build the debug patch jar. Note that this is a plain jar file, that is, not built as an application archive. Also note that we need not compile the entire application (although it would not hurt). The debug patch jar should contain only the classes which have changed (in this case, Factorial.class).

mkdir patch_classes
javac -d patch_classes Factorial.java
jar cvf factorial_debug_01.jar -C patch_classes .

Activating Debug Patches

In most real world scenarios, creators (developers) and activators (system administrators) of debug patches would be different people. For the purpose of illustration, we will wear multiple hats here. Assuming that we are using the default configuration for the location of the debug patches directory, create the debug_patches directory under the domain directory if it is not already there. Copy the factorial_debug_01.jar debug patch jar into the debug_patches directory. Connect to the server with WLST as above.

First, let us check which debug patches are available in the domain. This can be done with the listDebugPatches command.

Hint: To see available diagnostics commands, issue help('diagnostics') command. To get information on specific command, issue help(commandName), e.g. help('activateDebugPatch').

wls:/mydomain/serverConfig/> listDebugPatches()         
myserver:
Active Patches:
Available Patches:
    factorial_debug_01.jar
    app2.0_patch01.jar
    app2.0_patch02.jar 

factorial_debug_01.jar is the newly created debug patch. app2.0_patch01.jar and app2.0_patch02.jar were created in the past to investigate issues with some other application. The listing above shows no "active" patches since none have been activated so far.

Now, let us activate the debug patch with the activateDebugPatch command.

tasks=activateDebugPatch('factorial_debug_01.jar', app='factorial', target='myserver')
wls:/mydomain/serverConfig/> print tasks[0].status                                                                 
FINISHED
wls:/mydomain/serverConfig/> listDebugPatches()     
myserver:
Active Patches:
    factorial_debug_01.jar:app=factorial
Available Patches:
    factorial_debug_01.jar
    app2.0_patch01.jar
    app2.0_patch02.jar

The command returns an array of tasks which can be used to monitor the progress and status of the activation command. Multiple managed servers and/or clusters can be specified as targets if applicable. Corresponding to each applicable target server, there is a task in the returned tasks array. The command can also be used to activate debug patches at the server and middleware level. Such patches will typically be created by Oracle Support as needed. The output of the listDebugPatches() command above shows that factorial_debug_01.jar is now activated on the application "factorial".

Now, let us send some requests to the application: http://localhost:7001/factorial/factorial?n=4 and http://localhost:7001/factorial/factorial?n=5

Server output:

FactorialServlet called for input=4
MYDEBUG> returning saved factorial(4) = 24
FactorialServlet called for input=5
MYDEBUG> saving factorial(5) = 120

Notice that for input=4, saved results were returned since the values were computed and saved in the map due to a prior request. Thus, the debug patch was activated without destroying existing state in the application. For input=5, values were not previously computed and saved, thus a different debug message showed up.

Activating Multiple Debug Patches

If needed, multiple patches which potentially overlap can be activated. A patch which is activated later would mask the effects of a previously activated patch if there is an overlap. Say, in the above case, we need more detailed information from the factorial() method as it is executing its inner loop. Let us create another debug patch, copy it to debug_patches directory and activate it.

Updated (version 2) Factorial.java:

class Factorial {
  private static final Factorial SINGLETON = new Factorial();
  private Map<String, Integer> map = new ConcurrentHashMap<String, Integer>();
  static Factorial getInstance() {
    return SINGLETON;
  }
  public int factorial(String n) {
    if (n == null) {
      throw new NumberFormatException("Invalid argument: " + n);
    }
    n = n.trim();
    Integer val = map.get(n);
    if (val == null) {
      int i = Integer.parseInt(n);
      if (i < 0)
        throw new NumberFormatException("Invalid argument: " + n);
      int fact = 1;
      while (i > 0) {
        System.out.println("MYDEBUG> multiplying by " + i);
        fact *= i;
        i--;
      }
      val = new Integer(fact);
      System.out.println("MYDEBUG> saving factorial(" + n + ") = " + val);
      map.put(n, val);
    } else {
      System.out.println("MYDEBUG> returning saved factorial(" + n + ") = " + val);
    }
    return val;
  }
}

Build factorial_debug_02.jar

javac -d patch_classes Factorial.java
jar cvf factorial_debug_02.jar  -C patch_classes .
cp factorial_debug_02.jar $DOMAIN_DIR/debug_patches

Activate factorial_debug_02.jar

wls:/mydomain/serverConfig/> listDebugPatches()     
myserver:
Active Patches:
    factorial_debug_01.jar:app=factorial
Available Patches:
    factorial_debug_01.jar
    factorial_debug_02.jar
    app2.0_patch01.jar
    app2.0_patch02.jar
wls:/mydomain/serverConfig/> tasks=activateDebugPatch('factorial_debug_02.jar', app='factorial', target='myserver')
wls:/mydomain/serverConfig/> listDebugPatches()                                                                    
myserver:
Active Patches:
    factorial_debug_01.jar:app=factorial
    factorial_debug_02.jar:app=factorial
Available Patches:
    factorial_debug_01.jar
    factorial_debug_02.jar
    app2.0_patch01.jar
    app2.0_patch02.jar

Now, let us send some requests to the application: http://localhost:7001/factorial/factorial?n=5 and http://localhost:7001/factorial/factorial?n=6

FactorialServlet called for input=5
MYDEBUG> returning saved factorial(5) = 120
FactorialServlet called for input=6
MYDEBUG> multiplying by 6
MYDEBUG> multiplying by 5
MYDEBUG> multiplying by 4
MYDEBUG> multiplying by 3
MYDEBUG> multiplying by 2
MYDEBUG> multiplying by 1
MYDEBUG> saving factorial(6) = 720

We see the additional information printed due to code in factorial_debug_02.jar.

Deactivating Debug Patches

When the debug patch is not needed any more, it can be deactivated with the deactivateDebugPatches command. To get help on it, execute help('deactivateDebugPatches').

wls:/mydomain/serverConfig/> tasks=deactivateDebugPatches('factorial_debug_02.jar', app='factorial', target='myserver')            
wls:/mydomain/serverConfig/> listDebugPatches()                                                                        
myserver:
Active Patches:
    factorial_debug_01.jar:app=factorial
Available Patches:
    factorial_debug_01.jar
    factorial_debug_02.jar
    app2.0_patch01.jar
    app2.0_patch02.jar

Now, executing http://localhost:7001/factorial/factorial?n=2 gets us the following output in server's stdout window:

FactorialServlet called for input=2
MYDEBUG> saving factorial(2) = 2

Note that when we had activated factorial_debug_01.jar and factorial_debug_02.jar in that order, the classes in factorial_debug_02.jar masked those in factorial_debug_01.jar. After deactivating factorial_debug_02.jar, the classes in factorial_debug_01.jar were unmasked and became effective again. A comma-separated list of debug patches may be specified with the deactivateDebugPatches command. To deactivate all active debug patches on applicable target servers, the deactivateAllDebugPatches() command may be used.

WLST Commands

The following diagnostic WLST commands are provided to interact with the Dynamic Debug Patches feature. As noted above, help(command-name) shows help for that command.

Command                      Description
activateDebugPatch           Activate a debug patch on specified targets.
deactivateAllDebugPatches    De-activate all debug patches from specified targets.
deactivateDebugPatches       De-activate debug patches on specified targets.
listDebugPatches             List activated and available debug patches on specified targets.
listDebugPatchTasks          List debug patch tasks from specified targets.
purgeDebugPatchTasks         Purge debug patch tasks from specified targets.
showDebugPatchInfo           Show details about a debug patch on specified targets.

Limitations

The Dynamic Debug Patches feature leverages the JDK's hot-swap capability. The hot-swap capability has a limitation that hot-swapped classes cannot have a different shape than the original classes. This means that the classes which are swapped in cannot add, remove or update constructors, methods, fields, super classes, implemented interfaces, etc. Only changes in method bodies are allowed. It should be noted that debug patches typically only gather additional information and do not attempt to "fix" the problem as such. Minor fixes which do not change the shape of classes may be tried, but that is not the main purpose of this feature. Therefore, we don't expect this to be a big limitation in practice.
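For instance (a hedged, hypothetical illustration, not taken from the article), a debug patch class may change only what happens inside existing methods:

// Allowed in a debug patch: same class shape as the original, only a method body changes.
class Greeter {
  public String greet(String name) {
    System.out.println("MYDEBUG> greet called with " + name); // added logging only
    return "Hello, " + name;
  }
  // NOT allowed: adding a field or a new method (for example, "private int callCount;")
  // would change the class shape and the redefinition would be rejected by the JVM.
}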

One issue, however, is that in some cases the new debug code may need to maintain some state. For example, perhaps we want to collect some data in a map and only dump it out when some threshold is reached. The JDK limitation regarding shape changes creates problems in that case. The Dynamic Debug Patches feature provides a DebugPatchHelper utility class to help address some of those concerns. We will discuss that in a subsequent article. Please visit us back to read about it.

Using Diagnostic Context for Correlation

The WebLogic Diagnostics Framework (WLDF) and Fusion Middleware Diagnostics Monitoring System (DMS) provide correlation information in diagnostic artifacts such as logs and Java Flight Recorder (JFR).

The correlation information flows along with a Request across threads, within and between WebLogic Server processes, and can also flow across process boundaries to or from other Oracle products (such as from OTD or to the Database). This correlation information is exposed in the form of unique IDs which can be used to identify and correlate the flow of a specific Request through the system, and it can also provide details on the ordering of the flow.

The correlation IDs are described as follows:

  • DiagnosticContextID (DCID) and ExecutionContextID (ECID). This is the unique identifier of the Request flowing through the system. While the name of the ID differs depending on whether you are using WLDF or DMS, it is the same ID. I will use the term ECID, as that is the name used in the broader set of Oracle products.
  • Relationship ID (RID). This ID describes where in the overall flow (or tree) the Request currently is. The ID itself is an ordered set of numbers that describes the location of each task in the tree of tasks; for example, a servlet at the root of a flow might report an RID of 0, while an EJB it calls might report 0:1. The leading number is usually a zero. A leading number of 1 indicates that it has not been possible to track the location of the sub-task within the overall sub-task tree.

These correlation IDs have been around for quite a long time. What is new in 12.2.1 is that WLDF now picks up some capabilities from DMS (even when DMS is not present):

  1) The RelationshipID (RID) feature from DMS is now supported
  2) The ability to handle correlation information coming in over HTTP
  3) The ability to propagate correlation out over HTTP when using the WebLogic HTTP client
  4) The concept of a non-inheritable Context (not covered in this blog, may be the topic of another blog)

For this blog, we will walk through a simple contrived scenario to show how an administrator can make use of this correlation information to quickly find the data available related to a particular Request flow. This diagram shows the basic scenario:


Each arrow in the diagram shows where a Context propagation could occur; however, in our example propagation occurs only where we have solid blue arrows. The reason is that our example uses a browser client, which does not supply a Context, so the first place a Context is created is when MySimpleServlet is called. Note that a Context could propagate into MySimpleServlet if it were called by a client capable of providing the Context (for example, a DMS-enabled HTTP client, a 12.2.1+ WebLogic HTTP client, or OTD).

In our contrived applications, we have each level query the value of the ECID/RID using the DiagnosticContextHelper API, and the servlet reports these values. A real application would not do this; it is just for our example purposes so that our servlet can display them.

We also have the EJB hard-coded to throw an Exception if the servlet request was supplied with a query string. The application logs warnings when that is detected, and the warning log messages automatically get the ECID/RID values included in them. The application does not need to do anything special to get them.

The applications used here, as well as some basic instructions, are attached in blog_example.zip.

First we will show hitting our servlet with an URL that is not expected to fail (http://myhost:7003/MySimpleServlet/MySimpleServlet):




From the screen shot above we can see that all of the application components are reporting the same ECID (f7cf87c6-9ef3-42c8-80fa-e6007c56c21f-0000022f). We can also see that the RID reported by each component is different and shows the relationship between the components:


Next we will show hitting our servlet with an URL that is expected to fail (http://myhost:7003/MySimpleServlet/MySimpleServlet?fail):

We see that the EJB reported that it failed. In our contrived example app, we can see that the ECID for the entire flow where the failure occurred was "f7cf87c6-9ef3-42c8-80fa-e6007c56c21f-00000231". In a real application, that would not be the case. An administrator would most likely first see warnings reported in the various server logs, along with the ECID reported in those warnings. Since we know the ECID in this case, we can "grep" for it to show what those warnings would look like and that they have the ECID/RID reported in them:

Upon seeing that we had a failure, the admin will capture JFR data from all of the servers involved. In a real scenario, the admin may have noticed the warnings in the logs, or perhaps had a Policy/Action (formerly known as Watch/Notification) configured to automatically notify or capture data. For our simple example, a WLST script is included to capture the JFR data.
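The bundled script is not reproduced here, but a minimal sketch of the idea, using the standard WLST captureAndSaveDiagnosticImage() diagnostics command, might look like the snippet below. The connection details are illustrative, and the JFR recording travels inside the generated diagnostic image:

# Capture a diagnostic image (which contains the FlightRecording.jfr file)
# from each server involved in the flow; repeat for webappServer and ejbServer
connect('weblogic','welcome1','t3://myhost:7001')
captureAndSaveDiagnosticImage()
disconnect()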


The assumption is that folks here are familiar with JFR and Java Mission Control (JMC), and that they have installed the WebLogic Plugin for JMC (see the video on installing the plugin).

Since we have an ECID in hand already related to the failure (in a real case this would be from the warnings in the logs), we will pull up the JFR data in JMC and go directly to the "ECIDs" tab in the "WebLogic" tab group. This tab initially shows us an unfiltered view from the AdminServer JFR, which includes all ECIDs present in that JFR recording:

Next we will copy/paste the ECID "f7cf87c6-9ef3-42c8-80fa-e6007c56c21f-00000231" into the "Filter Column" for "ECID":


With only the specific ECID displayed, we can select it and see the JFR events present in the recording related to that ECID. We can right-click to add the associated events to the "operative set" feature in JMC. Once in the "operative set", other views in JMC can be set to show only the operative set; see Using WLDF with Java Flight Recorder for more information.

Here we see screen shots showing the same filtered view for the ejbServer and webappServer JFR data:



In our simple contrived case, the failure we forced was entirely within application code. As a result, the JFR data we see here shows the overall flow for example purposes, but it will not give us more insight into the failure in this specific case. In cases where something covered by JFR events causes a failure, it is a good way to see what failed and what happened leading up to the failure.

For more related information, see:

Wednesday Oct 28, 2015

Zero Downtime Patching Released!

Patching and Updating WebLogic servers just got a whole lot easier!  The release of Zero Downtime Patching marks a huge step forward in Oracle's commitment both to simplifying the maintenance of WebLogic servers, and to our ability to provide continuous availability.

Zero Downtime Patching allows you to roll out distributed patches to multiple clusters or to your entire domain with a single command, all without causing any service outages or loss of session data for the end user. It takes what was once a tedious and time-consuming task and replaces it with a consistent, efficient, and resilient automated process.

By automating this process, we drastically reduce the amount of human input required (and with it the opportunity for errors), and we verify the input that is given before making any changes. This has a huge impact on the consistency and reliability of the process, and it also greatly improves its efficiency.

The process is resilient in that it can retry steps when there are errors, it can pause for problem resolution and resume where it left off, or if desired, it can revert the entire environment back to its original state.

As an administrator, you create and verify a patched OracleHome archive with existing and familiar tools, and place the archive on each node that you want to upgrade. Then, a simple command like the one below will handle the rest.

rolloutOracleHome("Cluster1, Cluster2", "/pathTo/patchedOracleHome.jar", "/pathTo/backupOfUnpatchedOracleHome")
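For reference, a hedged sketch of running this from a WLST session is shown below. The connection values are illustrative, and treating the return value as a printable progress/task object is an assumption, so check the ZDT documentation for the exact monitoring API:

# Connect to the Administration Server, which orchestrates the rollout
connect('weblogic','welcome1','t3://adminhost:7001')
# Start the rollout across both clusters; the returned object tracks progress
progress = rolloutOracleHome("Cluster1, Cluster2",
                             "/pathTo/patchedOracleHome.jar",
                             "/pathTo/backupOfUnpatchedOracleHome")
# Print the task so its status can be inspected
print progress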

The way the process works is that we take advantage of existing clustering technology, combined with an Oracle Traffic Director (OTD) load balancer, to take individual nodes offline one at a time to be updated. We communicate with the load balancer and instruct it to redirect requests to active nodes. We also created some advanced techniques for preserving active sessions, so the end user will never even know the patching is taking place.

We can leverage this same process for updating the Java version used by servers, and even for doing some upgrades to running applications, all without service downtime for the end-user.
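A hedged sketch of those companion operations follows; the rolloutJavaHome and rolloutApplications command names come from the ZDT tooling, but the argument layouts shown here are illustrative assumptions, so verify them against the documentation before use:

# Roll out a new JDK to the same clusters, one node at a time (arguments assumed)
progress = rolloutJavaHome("Cluster1, Cluster2", "/pathTo/newJavaHome")
# Roll out updated application versions described in a properties file (arguments assumed)
progress = rolloutApplications("Cluster1, Cluster2", "/pathTo/applicationProperties")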

There are a lot of exciting aspects to Zero Downtime (ZDT) Patching that we will be discussing here, so check back often!

For more information about Zero Downtime Patching, view the documentation.

WebLogic Server Multitenant Info Sources

In Will Lyons’s blog entry, he introduced the idea of WebLogic Server Multitenant, and there have been a few other blog entries related to WebLogic Server Multitenant since then. Besides these blogs and the product documentation, there are a couple of other things to take a look at:

I just posted a video on YouTube at https://youtu.be/C5GP_JB88VY. This video includes a high-level introduction to WebLogic Server Multitenant. It is a little longer than my other videos, but there are a lot of good things to talk about in WebLogic Server Multitenant.

We also have a datasheet at http://www.oracle.com/us/products/middleware/cloud-app-foundation/weblogic/weblogic-server-multitenant-ds-2742664.pdf, which includes a fair amount of detail regarding the value and usefulness of WebLogic Server Multitenant.

I’m at OpenWorld this week, where we are seeing a lot of interest in these new features. One aspect of value that keeps coming up is that of running on a shared platform. There are cases where every time a new environment is added, it needs to be certified against security requirements or standard operating procedures. By sharing a platform, those certifications only need to be done once for the environment. New applications deployed in pluggable partitions would not need a ground-up certification. This can mean faster rollout times, faster time to market, and reduced costs.

 That’s all for now. Keep your eye on this blog. More info coming soon!
