Tuesday Feb 26, 2013

Performance Tuning an Exalogic System


I tend to get annoyed at my engineering pals for designing performance into automobiles such as the Chevy Corvette, instead of letting the driver feel the satisfaction of increasing performance by improving his or her technique. Many sysadmins feel the same about their craft. But as the story of Paul Bunyan demonstrates, we must adapt or die.

In a previous post I discussed how Exalogic changes the way you handle provisioning. In this post, I'll focus on how Exalogic changes the way you handle performance tuning: first, the optimizations that are already done for you; then, the optimizations you can still perform yourself.

Performance Optimizations Designed Into Exalogic

Because Oracle engineering knows the exact details of the environment in which each component is operating, Oracle has configured Exalogic components to use the internal network, memory, and storage for optimum performance, availability, and security. Exalogic employs two types of optimizations:

Generic Optimizations (Exabus)

These optimizations will benefit any software running on the Exalogic machine, whether Oracle or third party, in physical or virtual environments. The collection of Exalogic-specific optimizations is referred to as Exabus. The purpose of Exabus is primarily to integrate Infiniband networking seamlessly into all the hardware, software, and firmware distributed throughout the system. Examples include:

  • Changes to the firmware and drivers in the network switches that increase performance by skipping protocol stack conversions
  • Use of Exalogic solid state disk caching to increase the speed and capacity of local (shared) data read and write operations, such as JMS queues and run time metadata.
  • Built-in high availability at network and storage levels
  • Native Infiniband integration with any other engineered systems, such as additional Exalogic machines, ZFS storage appliances, or Exadata Database machines.
  • The ability to define Infiniband partitions, which ensure application isolation and security.

Optimizations to Run-Time Components

Oracle has engineered optimizations for Exalogic directly into Oracle WebLogic Server (WLS), Coherence, and Tuxedo. They benefit any application running on those software components, but they can only be activated on the Exalogic platform. They address performance limitations that only become apparent when the software is running on Exalogic's high-density computing nodes and very fast Infiniband switches. Examples include:

  • WebLogic Server session replication uses the SDP layer of IB networking to maximize performance of large-scale data operations. This avoids some of the typical TCP/IP network processing overhead.
  • Cluster communication has been redesigned in Coherence to further minimize network latency when processing data sets across caches. Its elastic data feature increases performance by minimizing network and memory use in both RAM and garbage collection processing.
  • Tuxedo has been similarly enhanced to make increasing use of SDP and RDMA protocols in order to optimize the performance of inter-process communications within and between compute nodes.

Tuning You Can Perform on Exalogic

Benchmarks and other tests show that applications that run well on Oracle middleware will run better on Exalogic. The degree to which they run better depends on how well they are optimized to take advantage of the Exalogic system, as well as how well the Exalogic components are set up to balance resources.

However, if your workloads or configurations change, you may need to tune your Exalogic. Here are some general notes, extracted from the Exalogic: Administration Tasks and Tools white paper.

Tuning the Middleware

At the middleware and application level, most of the standard options and techniques are available to you. WebLogic Server, JRockit, Coherence, iAS, and the rest operate as they do on traditional platforms.

As for the rest of the Exalogic platform, Oracle's recommendation is: leave it alone.

Tuning The Platform

Exalogic manages itself, so you don't need to adjust it unless you are sure that something needs changing. This is a major change in approach, since you are used to spending considerable time tweaking your systems to accommodate the needs of different groups. Knowing exactly when and how much (or how little) to tune an Exalogic system is a big topic, but here are some general guidelines.

  • Because Exalogic has such a high density of compute resources across such a fast network, small configuration changes can have a large impact.
  • Try out your changes in a test environment, first. Make sure its resources, configurations, and workload match those of your production system as closely as possible. Oracle Application Replay is a good tool for assessing the impact of configuration and infrastructure changes on the performance of your applications. Give it a try.
  • Focus on reducing response times for users and applications. If response time is not a problem, you probably don't have an issue to resolve, regardless of internal alerts and indicators you may be noticing.
  • Capture the right performance baselines ahead of time so you can compare the results of your tuning to them.

Tuning the Infrastructure

Storage, Infiniband, and OS are set up during initial configuration, so further tuning is not usually needed. If you need to review the kernel settings, network bonding, and MTU values, or perhaps the NFS settings, use Enterprise Manager. Finding the optimum changes tends to be an iterative process that varies with application workload.
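If you just want a quick, read-only look at what's currently in effect on a compute node before you reach for Enterprise Manager, the values involved live in ordinary Linux interfaces. Here's a minimal sketch that reads MTU values from /sys and NFS mount options from /proc/mounts (written for a generic Oracle Linux node; nothing about it is Exalogic-specific):

```python
#!/usr/bin/env python3
"""Minimal sketch: report MTU values and NFS mount options on a Linux node.

Read-only; it only inspects standard /sys and /proc interfaces.
"""
from pathlib import Path

def report_mtus():
    # Every network interface exposes its MTU under /sys/class/net/<if>/mtu.
    for iface in sorted(Path("/sys/class/net").iterdir()):
        mtu = (iface / "mtu").read_text().strip()
        print(f"{iface.name:12} MTU {mtu}")

def report_nfs_mounts():
    # /proc/mounts lists each mount as: device mountpoint fstype options ...
    for line in Path("/proc/mounts").read_text().splitlines():
        device, mountpoint, fstype, options = line.split()[:4]
        if fstype.startswith("nfs"):
            print(f"{mountpoint} ({device}) options: {options}")

if __name__ == "__main__":
    report_mtus()
    report_nfs_mounts()
```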

Tuning the Middleware Runtime Environment

Ensure that Exalogic optimizations for WLS Suite are switched on (see MOS note 1373571.1), since they affect replication channels, packet sizes, and the use of the SDP protocol in the Infiniband networks.
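The MOS note has the authoritative procedure, but for the curious, the domain-level switch can also be flipped from WLST. A rough sketch only: the host, port, and credentials below are placeholders, and the ExalogicOptimizationsEnabled attribute name is from my reading of the WebLogic Server DomainMBean reference, so verify it against the note before relying on it.

```python
# WLST (Jython) sketch -- illustrative only; follow MOS note 1373571.1 for
# the supported steps. Connection details are placeholders.
connect('weblogic', 'welcome1', 't3://adminhost:7001')
edit()
startEdit()
cd('/')
# DomainMBean exposes an ExalogicOptimizationsEnabled attribute; setting it
# to true turns on the Exalogic-specific WLS optimizations domain-wide.
cmo.setExalogicOptimizationsEnabled(true)
save()
activate()
disconnect()
```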

Oracle Traffic Director is currently unique to Exalogic, so it is not available on other platforms. You can alter traffic routing rules for each application at any time. As workloads change and grow, this is likely to be a key tuning task.

Tuning the Applications

At present you can tune business applications just as you would on traditional platforms. One possible side effect of running your business applications on Exalogic is that its enhanced performance may unmask poorly tuned applications or poorly written customizations.

For More Information

For more information, read the Exalogic: Administration Tasks and Tools white paper.

- Rick

Follow me on:
Blog | Facebook | Twitter | YouTube | The Great Peruvian Novel

Friday Feb 22, 2013

How to Configure the Linux Kernel's Out of Memory Killer


Operating systems sometimes behave like airlines. Since the airlines know that a certain percentage of the passengers won't show up for their flight, they overbook the flights. As anyone who has been to an airport in the last 10 years knows, they usually get it wrong and have to bribe some of us to get on the next flight. If the next flight is the next morning, we get to stay in a nice hotel and have a great meal, courtesy of the airline.

That's going to be my lodging strategy if I'm ever homeless.

The Linux kernel does something similar. It allocates memory to its processes ahead of time. Since it knows that most of the processes won't use all the memory allocated to them, it over-commits. In other words, it allocates a sum total of memory that is more than it actually has. Once in a while, too many processes claim the memory that the kernel promised them at the same time. When that happens, the Linux kernel resorts to an option that the airlines wish they had: it kills off processes one at a time. In fact, it actually has a name for this function: the out-of-memory killer.

Robert Chase explains.

How to Configure the Out of Memory Killer

Robert Chase describes how to examine your syslog and how to use the vmstat command for clues about which processes were killed, and why. He then shows you how to configure the OOM killer to behave the way you prefer. For instance, you can make certain processes less likely to be killed than others. Or more. Or you can instruct the kernel to reboot instead of killing processes.
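The knobs typically involved are ordinary /proc interfaces, so you can script them. Here's a minimal sketch of the kind of adjustments he talks about: checking the kernel's overcommit state, exempting a process from the OOM killer, and switching to reboot-on-OOM. Run it as root; the PID below is hypothetical.

```python
#!/usr/bin/env python3
"""Minimal sketch of the standard OOM-killer knobs (run as root)."""
from pathlib import Path

def show_overcommit_state():
    # CommitLimit and Committed_AS in /proc/meminfo show how far the kernel
    # has over-committed; vm.overcommit_memory selects the overcommit policy.
    for line in Path("/proc/meminfo").read_text().splitlines():
        if line.startswith(("CommitLimit", "Committed_AS")):
            print(line)
    print("overcommit_memory =",
          Path("/proc/sys/vm/overcommit_memory").read_text().strip())

def protect_process(pid, adj=-1000):
    # oom_score_adj ranges from -1000 (never kill) to 1000 (kill first).
    Path(f"/proc/{pid}/oom_score_adj").write_text(str(adj))

def reboot_instead_of_killing(delay_seconds=10):
    # vm.panic_on_oom=1 makes the kernel panic on OOM instead of picking a
    # victim; kernel.panic=N reboots the machine N seconds after a panic.
    Path("/proc/sys/vm/panic_on_oom").write_text("1")
    Path("/proc/sys/kernel/panic").write_text(str(delay_seconds))

if __name__ == "__main__":
    show_overcommit_state()
    protect_process(1234)          # 1234 is a hypothetical PID
    # reboot_instead_of_killing()  # uncomment to reboot rather than kill
```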

More Oracle Linux Resources

- Rick

Follow me on:
Blog | Facebook | Twitter | YouTube

(psst! and don't forget to follow the Great Peruvian Novel!)

Thursday Feb 21, 2013

Can You Figure Out Which Teenager Took the Cash?


Dads like me are familiar with a phenomenon known as Silent Dollar Disappearance. This tends to occur when there is a confluence of money in your wallet and teenage children in your home. You never actually see it happen, but if you are paying attention, you might detect that it has happened. As when, for instance, you try to pay for beer and brats at the grocer. It becomes difficult to know for sure whether it was the teenagers. What if you already spent the money on something else? That's what my teenage daughters always said. Or perhaps you had a wallet malfunction, and it flew out. So difficult to pinpoint the actual cause.

Linux, like any OS, is vulnerable to a similar phenomenon. It's called silent data corruption. It can be caused by faulty components, such as memory modules or storage systems. It can also be caused by (God forbid) administrative error. As with Silent Dollar Disappearance, it's difficult to detect when data corruption is actually happening. Or what the exact cause was. But, as with Dads and teenagers, you eventually figure out that it has happened.

It may be impossible to identify the culprit after the data has been corrupted, but it's not impossible to stop the culprit ahead of time. Oracle partnered with EMC and Emulex to do just that. And they were kind enough to explain how they did it and how you can benefit. In this article:

Preventing Silent Data Corruption in Oracle Linux

An excerpt ...

"Data integrity protection is not new. ECC and CRC are available on most, if not all, servers, storage arrays, and Fibre Channel host bus adapters (HBAs). But these checks protect the data only temporarily within a single component. They do not ensure that the data you intended to write does not become corrupt as it travels down the data path from the application running in the server to the HBA, the switch, the storage array, and then the physical disk drive. When data corruption occurs, most applications are unaware that the data that was stored on the disk is not the data that was intended to be stored.

"Over the last several years, EMC, Emulex, and Oracle have worked together to drive and implement the Protection Information additions to the T10 SBC standard, which enables the validation of data as it moves through the data path to ensure that silent data corruption does not occur."

Interesting stuff. Give it a read.

- Rick

Follow us on:
Blog | Facebook | Twitter | YouTube

(psst! and don't forget to follow the Great Peruvian Novel!)

Tuesday Feb 19, 2013

Provisioning Oracle Exalogic: What's Involved


In this interview from 2012, Marshall Choy explains to dear old Justin how Oracle's engineered systems and optimized solutions will impact the job of a sysadmin.

I was just reading a recently published Oracle White Paper that goes into a little more detail...

"While the core middleware or applications administration role is largely the same as for non-Exalogic environments, significantly less work is required to manage storage, OS, and networks. In addition, some administration tasks are simplified."

That sounded interesting, so I kept reading. Here is an excerpt of what it says about provisioning.

Provisioning New Environments

Provisioning is done so frequently in some organizations that it's almost a continuous effort. Exalogic was designed as a multi-tenant environment in which many applications and user communities operate in secure isolation while running on a shared compute infrastructure. As a result, provisioning environments for development, testing, or other projects is simply a case of reconfiguring those existing shared resources. And it takes hours rather than weeks.

The typical steps involved are:

  1. Storage – using the ZFS BUI
    1. Create NFS v4 shares
    2. Define Access Control List
  2. Compute nodes – via standard OS commands
    1. Decide which nodes are to be used for this project. In the current Exalogic X3-2 machines each node has 16 processing cores and 256 GB RAM. For each node:
      1. Create the root OS user, if it does not already exist.
      2. Add a mount point entry for the shared storage to the /etc/fstab file and issue the mount command to enable access to it from the compute node (a sketch of this step follows the list).
  3. Network – using the Exalogic IB subnet manager
    1. Identify IP addresses for the compute nodes to be used. Add any new virtual IP addresses to be used to ensure middleware high availability.
    2. Define new virtual network interfaces (VNICs) to enable connections to Exalogic from the rest of the data center.
    3. Associate the pre-set, external-facing IP addresses with the VNICs.
    4. Define Exalogic Infiniband partitions to create secure groups of compute nodes / processors.
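The fstab-and-mount step above is plain Linux administration. A minimal sketch of it, with a made-up share path and mount point; substitute the hostname, share, and mount options defined for your own ZFS storage appliance project.

```python
#!/usr/bin/env python3
"""Minimal sketch: add an NFS share to /etc/fstab and mount it (run as root).

The share path, mount point, and options below are examples only.
"""
import subprocess
from pathlib import Path

SHARE = "zfs-appliance:/export/project1/domains"   # hypothetical NFS v4 share
MOUNT_POINT = "/u01/project1"                       # hypothetical mount point
OPTIONS = "rw,bg,hard,rsize=131072,wsize=131072,vers=4"

def add_fstab_entry_and_mount():
    Path(MOUNT_POINT).mkdir(parents=True, exist_ok=True)
    entry = f"{SHARE} {MOUNT_POINT} nfs {OPTIONS} 0 0\n"
    fstab = Path("/etc/fstab")
    if entry not in fstab.read_text():
        with fstab.open("a") as f:
            f.write(entry)
    subprocess.run(["mount", MOUNT_POINT], check=True)

if __name__ == "__main__":
    add_fstab_entry_and_mount()
```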

No physical cabling is required as network configuration is defined at the software level. In the event of a major failure, however, you may need to re-image the OS on some or even all compute nodes as a faster alternative to restoring from backup.

This whole process should take no more than an hour, after which a new, fully functioning compute platform is available for the project. It does not require any other data center resources.

Further details are available in the Exalogic Enterprise Deployment Guide.

I'll keep reading it and sharing some nuggets here. See the entire paper.

- Rick

Follow us on:
Blog | Facebook | Twitter | YouTube

(psst! and don't forget to follow the Great Peruvian Novel!)

Monday Feb 18, 2013

Three Oracle VM Hands-On Labs On OTN


We've posted the hands-on labs from the virtualization track of the OTN Virtual Sysadmin Days on OTN.

Lab 1 - Deploying an IaaS Environment with Oracle VM

Planning and deployment of an infrastructure as a service (IaaS) environment with Oracle VM as the foundation. Storage capacity planning, LUN creation, network bandwidth planning, and best practices for designing and streamlining the environment so that it's easy to manage.

Lab 2 - How to Virtualize and Deploy Oracle Applications in Minutes with Oracle VM

How to deploy Oracle applications in minutes with Oracle VM Templates. Find out what Oracle VM Templates are and how they work. Deploy an actual Oracle VM Template for an Oracle application. Plan your deployment to streamline ongoing updates and upgrades.

Lab 3 - Deploying a Cloud Infrastructure with Oracle VM 3.x and the Sun ZFS Storage Appliance

This hands-on lab will demonstrate what Oracle’s enterprise cloud infrastructure for x86 can do, and how it works with Oracle VM 3.x. How to create VMs. How to migrate VMs. How to deploy Oracle applications quickly and easily with Oracle VM Templates. How to use the Storage Connect plug-in for the Sun ZFS Storage Appliance.

By the way, the picture of that ranch in Colorado was taken by my good friend
Mike Schmitz. See more of his photography here. Follow it on Facebook here.

- Rick

Follow us on:
Blog | Facebook | Twitter | YouTube

(psst! and don't forget to follow the Great Peruvian Novel!)

Friday Feb 15, 2013

Sysadmins Rejoice! OVM 3.2.1 Includes a Full-Featured CLI

Remember this famous scene from English history? The French accent of the castle guard was so thick I couldn't understand him, but I think that at one point he said "I spit on your graphical user interface." Proof that sysadmins were alive and well in the time of King Arthur.

CLI Documentation

Sysadmins will have cause to taunt English royalty a second time because the command line interface (CLI) of the recently released Oracle VM 3.2.1 has been expanded to include all the capabilities of the (ptui!) graphical user interface (GUI). That means scripts. Boo-yah! It supports public-key authentication, too. Find docs here.
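If you want to see what scripting against it looks like, here's a minimal sketch. It assumes the Oracle VM Manager CLI is reachable over ssh on port 10000 as the admin account with a public key already registered, and that "list Server" is a valid CLI command on your release; check the CLI documentation to confirm those details.

```python
#!/usr/bin/env python3
"""Minimal sketch: drive the Oracle VM Manager 3.2 CLI from a script.

Assumptions to verify against the CLI docs for your release: the CLI
listens on ssh port 10000, the account is 'admin', a public key has been
registered for it, and 'list Server' is a valid command.
"""
import subprocess

OVM_MANAGER = "ovm-manager.example.com"   # hypothetical manager hostname

def ovm_cli(command):
    # Run one CLI command over ssh and return its output.
    result = subprocess.run(
        ["ssh", "-p", "10000", f"admin@{OVM_MANAGER}", command],
        capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    print(ovm_cli("list Server"))
```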

Other Cool Stuff

Oracle VM Manager used to manage only your x86 virtual machines. Now it manages your SPARC systems, too. Create server pools, create virtual machines, and manage networking and storage in the same way, using the same tool. Details here.

You can use MySQL as your backend repository. Just use the Simple installation, which will locally install the default MySQL database that is packaged with the Oracle VM Manager installer. Details here.

You can install the osv-support-tools meta-package for easier integration with Oracle support tools. (sudo is now part of osv-support-tools.) Details here.

More Resources

- Rick

Follow us on:
Blog | Facebook | Twitter | YouTube

(psst! and don't forget to follow the Great Peruvian Novel!)

Monday Feb 11, 2013

Oracle Solaris 10 Still Rocks


When it was launched back in 2005, Oracle Solaris 10 rocked the IT world. I heard a rumor that Scott tried to launch it at a Rolling Stones concert, but apparently Mick Jagger didn't think operating systems were sexy.

Operating systems not sexy? Since when?

Well, Mick, when was the last time you released a new album? Oracle Solaris 10 released one last Friday, pal.

Oracle Solaris 10 1/13 Release

The new release is integrated with My Oracle Support. As a result, you can view the system configuration, asset inventory, and change history of your Solaris systems on the support portal, along with the results of the health checks that Oracle Support performs. (Kinda like letting a pregnant woman have access to continuous ultrasound via her cell phone, huh?)

This support will be available for Oracle Solaris 10 through 2018. After that, it will be supported through Oracle's Lifetime Support Policy.

There's plenty more:

Technical Resources

Thursday Feb 07, 2013

Five Perspectives on Virtual Networks


At about the time I finally understood server virtualization, they hit me with network virtualization. Or was it virtualized networks? Virtual networking? So, did that mean that you networked your virtual environments together? Or did it mean that you created a virtual network? A virtual network of virtual servers? Or physical servers?

I did what any techie would do when confronted with a conundrum: I played video games until 2:00 am. Then it came to me: a virtual network is simply a physical network sliced into multiple virtual networks. That wasn't so hard. In fact, we currently provide two ways to create a virtual network: within the OS and at the hypervisor. Shoot, you can even pretend to create a virtual network by firing up VirtualBox. To help you decide which type of network virtualization to use, we put together a few perspectives:

How Networking Works in VirtualBox

by the Fat Bloke

Start here, just in case you want to become familiar with virtual networks to avoid bringing down your entire data center. The Fat Bloke describes how to set up your virtual networks inside VirtualBox and configure them so the physical networks understand what you're trying to do. He covers Network Address Translation (NAT), bridged networking, internal networking, host-only networking, and NAT with port forwarding.
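If you'd rather poke at those modes from a script than from the GUI, the same settings can be applied with VBoxManage. A minimal sketch; the VM name "solaris11-test" and host interface "eth0" are just examples, and the VM should be powered off before you change its NIC mode.

```python
#!/usr/bin/env python3
"""Minimal sketch: switch a VirtualBox VM's first NIC between network modes.

The VM name and host interface below are examples; VBoxManage must be on
the PATH.
"""
import subprocess

VM = "solaris11-test"   # hypothetical VM name

def vboxmanage(*args):
    subprocess.run(["VBoxManage", *args], check=True)

def use_nat_with_ssh_forwarding():
    # NAT mode, with host port 2222 forwarded to guest port 22.
    vboxmanage("modifyvm", VM, "--nic1", "nat",
               "--natpf1", "guestssh,tcp,,2222,,22")

def use_bridged(host_interface="eth0"):
    # Bridged mode: the guest appears directly on the host's physical network.
    vboxmanage("modifyvm", VM, "--nic1", "bridged",
               "--bridgeadapter1", host_interface)

def use_internal(network_name="intnet0"):
    # Internal networking: guests on the same internal network see each
    # other, but the host does not.
    vboxmanage("modifyvm", VM, "--nic1", "intnet", "--intnet1", network_name)

if __name__ == "__main__":
    use_nat_with_ssh_forwarding()
```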

Evaluating Oracle Solaris 11 from Inside Oracle VM VirtualBox

by Yuli Vasiliev

Now you can horse around a little bit with the Oracle Solaris virtual network goodies. Yuli Vasiliev explains how to import an Oracle Solaris 11 image into VirtualBox, how to configure the virtual machine settings, and how to explore virtual networking at the OS layer, among other things.

Looking Under the Hood at Networking in Oracle VM Server for x86

by Greg King and Suzanne Zorn

Now you're ready to take a closer look at virtual networking in the hypervisor; specifically, Oracle VM Server for x86. Greg King and Suzanne Zorn describe how you can create logical networks out of physical Ethernet ports, bonded ports, VLAN segments, virtual MAC addresses (VNICs), and network channels. And how to assign channels (or "roles") to each logical network so that it handles the type of traffic you want it to. Very cool read + additional resources.

Which Tool Should I Use to Manage Which Virtualization Technology?

by Ginny Henningsen

Now that you have a better understanding of each method, it's only natural to wonder which tools to use, right? Ginny Henningsen provides an overview of the interfaces and tools that you can use to set up and manage virtual network resources, among other things.

Network Virtualization and Network Resource Management

by Detlef Drewanz

And, if you want to take it a step further, consider adding resource management to your virtual network picture. This article describes what's involved in managing network resources in conjunction with hypervisors, containers, and zones in an internal virtual network.

Let me know if you'd like any more info about virtual networks. We've got a bunch.

Follow us on:
Blog | Facebook | Twitter | YouTube

(psst! and don't forget to follow the Great Peruvian Novel!)

Tuesday Feb 05, 2013

Do YOU Know Where Your Data Has Been?

When you get change at the grocery store, you just don’t know where it’s been. And frankly, I don’t want to know, but wherever it’s been, it’s been in different environments with different wear-and-tear. If you try to re-use those dollar bills in a vending machine, you might get your candy bar. Or you might not, if the vending machine says your money is unreadable.

You get a less icky feeling about where your transportable storage has been, that is, until data you were expecting is as unreadable as that old dollar bill. Unfortunately, there is no native data integrity checking as data moves across storage landscapes. However, the Oracle T10000C Data Integrity Validation (DIV) feature uses hardware-assisted CRC checks not only to help ensure the data is written correctly the first time, but also to do so much more efficiently.

Data at rest is generally not an issue for any storage platform. In tape drives, data is protected with read-after-write verification as it is written, and Error Correction Code (ECC) is added to ensure data recovery once it is on the medium. In addition, a typical tape drive adds Cyclic Redundancy Code (CRC) protection as soon as a record is received. This ensures the record does not get corrupted while moving between internal memories. Checking the CRC, though, is a time-consuming process that moves through the following steps:

  1. File pulled from disk to be stored on tape
  2. 256-bit CRC generated and stored in a catalog on a server
  3. File sent to tape drive without the CRC and written to a tape cartridge
  4. Upon recall, the file is called from a tape and sent to a server via the tape drive
  5. 256-bit CRC recreated and compared to catalog in the server

This process takes a minimum of 25 seconds to check the CRC on a 4 GB file, assuming a 2:1 compression ratio and a reasonable server workload. If the tape drives were allowed to assist in some of this workload, the processing time could be dramatically reduced. That’s the premise of the Oracle T10000C DIV feature’s hardware-assisted CRC check. The amount of reduction is simply dependent on the amount of trust the user places in the tape drive itself. While a basic model produces a slightly quicker process, the Oracle T10000C DIV process guarantees it will be done efficiently as shown in the table below.

CRC Verification Model #1
  1. File pulled from disk to be stored on tape
  2. 32-bit CRC generated and stored with each record on server
  3. File sent to tape drive; drive checks CRC
  4. File and CRC written to tape
  5. Upon recall, file and CRC called from tape to be read
  6. File and CRC checked in tape drive
  7. 32-bit CRC re-created and checked in hardware (Intel)
  Time: a minimum of 14 seconds to check the CRC on a 4 GB file (2:1 compression ratio)

Oracle T10000C Verification Model
  1. File system sends SCSI Verify command from server
  2. Tape drive receives command
  3. File and CRC written to tape
  4. Upon recall, file and CRC called from tape to be read
  5. Tape drive checks the 32-bit CRC
  6. SCSI Verify command and status returned to server
  Time: a maximum of 9 seconds to verify the CRC on a 4 GB file (2:1 compression ratio), independent of server workload

Obviously, built-in-the-drive, end-to-end integrity checking can be much less resource intensive than having to read an entire file to verify that it is still good. Any 32-bit CRC check can be done as specified in ANSI X3.139. This is the same CRC used in the Fibre Channel Protocol and the Fiber Distributed Data Interface (FDDI) for optical transmissions. As a result, the generation polynomial is readily available. While this is a standard interface CRC, it is important to note that this check can be performed outside the interface protocol. In addition, the drive also can generate and use a CRC in the Intel CRC32c format.

Supporting hardware-assisted CRC checking can be as simple as sending a specified SCSI mode select command to turn on the checking. When the Oracle T10000C drive is in its DIV mode, the last 32 bits of any record are treated as a CRC and used to check the integrity of each record. If the CRC check fails, a write error is reported to allow the application to resend the record. A bad record will never be written to tape. If the CRC is correct, that CRC is stored with the record on tape and checked every time the record is read. All of this is done with zero performance loss on the tape drive. If a deferred write error has been reported to the application, the application can determine which record was in error using multiple methods. The recovery is completed when the application resends the previously failed record and the remainder of the data records.
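The drive-side details live in firmware, but the host-side idea of "the last 32 bits of the record are its CRC" is easy to picture. Here's a minimal sketch of the CRC32c (Castagnoli) append-and-verify pattern an application or driver might use; it illustrates the format only, and is not Oracle's implementation (the byte order of the appended CRC is an arbitrary choice for the example).

```python
#!/usr/bin/env python3
"""Minimal sketch: append a CRC32c to a record and verify it on read-back."""
import struct

def crc32c(data: bytes, crc: int = 0) -> int:
    # Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78; this is
    # the same CRC the Intel CRC32 instruction computes.
    crc = ~crc & 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return ~crc & 0xFFFFFFFF

def append_crc(record: bytes) -> bytes:
    # The protected record is the payload followed by its 32-bit CRC.
    return record + struct.pack("<I", crc32c(record))

def verify(protected: bytes) -> bytes:
    payload, stored = protected[:-4], struct.unpack("<I", protected[-4:])[0]
    if crc32c(payload) != stored:
        raise ValueError("CRC mismatch: record corrupted in flight")
    return payload

if __name__ == "__main__":
    rec = append_crc(b"some 2 MB record, abbreviated for the example")
    assert verify(rec) == rec[:-4]
```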

If the drive is being utilized with CRC checks during a subsequent read operation, the CRC will be appended to the record. Verification of the file’s data integrity then is completed with a read verification. In other words, when a drive reads data having a CRC stored along with a record, it will output the CRC appended to the record. This allows the application or driver to perform its own data integrity checks to ensure, months or even years after recording, that the data has not been corrupted. The Intel CRC32c format allows very fast CRC processing and checking by the application. The user application, or driver, can use hardware-assisted CRC checks as follows:

  • Write with hardware-assisted CRC checks and read with hardware-assisted CRC checks
  • Write with hardware-assisted CRC checks and read in normal mode
  • Write in normal mode and read in hardware-assisted CRC checks mode (Note: In this case, the read CRC, which is generated by the drive on the fly, was not stored on tape.)

Another advantage of writing a tape in hardware-assisted CRC mode is the ability of the tape drive to use the Verify command to check an individual record, one file, multiple files, or the entire tape, without having to send all the data to the application to verify the validity of that data. This can be done because the hardware-assisted CRC is recorded on the tape with each record, and the tape drive has the ability to verify each record with that CRC. Because it is only 32 bits, checking only the CRC saves valuable processing resources and time. Ultimately, hardware-assisted CRC checking can have the following options:

  • Verify any record (up to 2MB)
  • Verify entire file (collection of 2MB records)
  • Verify N number of files
  • Verify N number of files of variable record size
  • Verify entire tape with one command
  • Verify mixed mode tape (hardware-assisted CRC check records and non-hardware-assisted CRC check records)
    • A hardware-assisted CRC check is not made on non-hardware-assisted CRC check records
    • The drive must be in the correct DIV mode for the records it is verifying

- Brian Zents

Follow the OTN Garage:
Blog | Facebook | Twitter | YouTube
