Thursday Nov 12, 2015

Updating an Oracle Solaris 8 system

Last week, we talked a bit about how to update branded zones. I saw a question this week about upgrading Oracle Solaris 8 systems, and branded zones figure into the answer here too.

"I've got some legacy applications running on some Oracle Solaris 8 systems. I'd like to get them running on S11 if possible. What can I do to upgrade these systems?"

Solaris 8 is old enough that there isn't an upgrade path all the way to Oracle Solaris 11. If you have legacy applications running on Solaris 8, and you want to move to a newer OS, you have a couple of options.

If you can move your applications onto S11, then you can use Ops Center to provision new S11 systems and then get the applications running on the new system.

However, some applications might not work on S11. In that case, you could create a Flash Archive of the existing S8 system, and use that FLAR to make an S8 branded zone on an S10 system. It's not S11, but it's not too shabby.

Thursday Nov 05, 2015

Branded Zones Questions

Branded Zones are handy if you want to run Oracle Solaris 10 zones on top of an S11 platform. However, they do work a bit differently from other zones in Ops Center. I've received a couple of questions about branded zones that I thought I'd try to clear up:

"I have a Control Domain with an Oracle Solaris 10 branded zone on it. Can I upgrade the control domain without Ops Center getting confused by the branded zone?"

Yes. When you upgrade a control domain or OS with a branded zone on it, the branded zone is skipped by design.

"Okay, so how do I upgrade the branded zone itself?"

Patching on a branded zone is similar to patching a global zone. If the branded zone doesn't have an agent, you can switch to agent management using the Switch Management Access option, then patch the branded zone normally.

Thursday Oct 29, 2015

Added Hardware and OS Support in Ops Center 12.3

There's some new support that's just been added to Ops Center 12.3.0. SPARC M7 and T7 servers are now officially certified, and Oracle Solaris 11.3 is certified, both as a managed OS and for the EC and PC systems.

There are some new and updated docs to go along with this support, as well. There are two revised how-tos that explain how to discover and manage these servers:

There are also two new how-tos, which aren't directly related to SPARC M7 or T7 but which you might find useful regardless:

Take a look at the What's New document for more information.

Thursday Oct 22, 2015

Ops Center Communication Issue with Some Java Versions

Some Ops Center users have run into an issue recently with certain Java versions on managed assets. Basically, if you upgrade to one of the problematic versions, it can disrupt communication between the different parts of Ops Center, causing assets to show as unreachable and jobs to fail.

The problem children are:

  • Java SE JDK and JRE 6 Update 101 or later

  • Java SE JDK and JRE 7 Update 85 or later

  • Java SE JDK and JRE 8 Update 51 or later
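
If you want to check a managed asset's Java release against the list above, a small shell function can encode the thresholds. This is only a sketch: the function name java_update_affected is made up for illustration, and it expects you to supply the major version and update number yourself rather than parsing them from the java command.

```shell
# java_update_affected MAJOR UPDATE
# Succeeds (exit status 0) when the release is at or past the first
# affected update for its major version, per the list above.
# The function name is hypothetical; it is not part of Ops Center.
java_update_affected() {
  major=$1
  update=$2
  case "$major" in
    6) first_bad=101 ;;
    7) first_bad=85 ;;
    8) first_bad=51 ;;
    *) return 1 ;;   # other major versions are not covered by the list
  esac
  [ "$update" -ge "$first_bad" ]
}

# Example: report on Java 7 Update 85
if java_update_affected 7 85; then
  echo "Java 7 Update 85: affected"
else
  echo "Java 7 Update 85: not affected"
fi
```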

There's a simple workaround for this issue that uses the OCDoctor script (by default, it's in the /var/opt/sun/xvm/OCDoctor/ directory). You can apply it either before or after you've hit the issue.

On the EC, run the OCDoctor script with the --update option (the fix requires version 4.51 or later). Then run it again with the --troubleshoot and --fix options.

If you're using JDK 6, you then need to download a new Agent Controller bundle based on the version of Ops Center that you're using:

Finally, use the same OCDoctor options (--update, then --troubleshoot --fix) on the Proxy Controllers. This will either clear up the issue, or prevent it from appearing at all.

Thursday Oct 15, 2015

Checking the Health of an Ops Center System

I saw a general question about keeping Ops Center running:

"I want to do a regular diagnostic to make sure that my Enterprise Controller and Proxy Controller systems are healthy. What tools can I use to do that?"

There are a few tools that you can use for this purpose.

First, there's the UI. The Administration section in the UI shows the status of all Proxy Controllers, and the current status of the Enterprise Controller services.

Outside of the UI, there are a number of tools bundled with the OCDoctor script that you can use. The OCDoctor and its toolbox are in the /var/opt/sun/xvm/OCDoctor/ folder on the Enterprise Controller and Proxy Controllers. Here are some of the options and tools you can use:

  • --update  - You should run this first, to make sure that you've got the latest version of the OCDoctor.
  • --troubleshoot  - This option will provide troubleshooting information for some known issues. You can also add the --fix option to apply fixes for some of these issues.
  • --check-connectivity  - This option will check for connectivity issues.
  • toolbox/ -b  - This script checks for issues with your libraries.

These options should help you stay aware of how Ops Center is doing.
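
As a rough sketch, the options above could be strung together into a routine check. The directory comes from this post, but the script file name OCDoctor.sh is an assumption here, and health_check_plan is a made-up helper name. The function only prints the commands, so you can review them before running anything.

```shell
# Directory is from the post; the file name OCDoctor.sh is an assumption.
OCDOCTOR=${OCDOCTOR:-/var/opt/sun/xvm/OCDoctor/OCDoctor.sh}

# Print the health-check commands in the order suggested above:
# update first, then troubleshoot, then the connectivity check.
health_check_plan() {
  for opt in --update --troubleshoot --check-connectivity; do
    printf '%s %s\n' "$OCDOCTOR" "$opt"
  done
}

health_check_plan
```

Once the printed commands look right for your system, you could run them by hand on the Enterprise Controller and on each Proxy Controller.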

Thursday Oct 08, 2015

Agent versus Agentless Management

I've seen a couple of questions recently about the differences between Agent managed and Agentless assets, so I thought I'd explain the differences and the relative merits.

You can manage operating systems and virtualization technologies in one of two ways - by installing an Ops Center Agent on them, or by providing Ops Center with credentials that it can use to reach the asset. The agent is tailored to the system - there are separate types of Agents for Zones and LDoms.

If you don't want to have anything else installed on a system, agentless management can provide management and monitoring capabilities. However, there are some features that aren't available for assets that are managed agentlessly. There's a table in the OS Management chapter that explains what features are and are not available with agentless assets.

Thursday Oct 01, 2015

Changing an Asset's Name

I got a question about an incorrect asset name:

"I discovered a server, but when I discovered it, it was named according to its IP address because of a DNS issue. The incorrect name in the UI angers me. How do I fix it?"

This is relatively simple. In the navigation section, select the All Assets node at the root of the asset tree, then select the Managed Assets tab. Select the incorrectly named asset, and click the pencil icon to edit its properties (including the name). Click Finish to save the changes.

Thursday Sep 24, 2015

Best Practices for EC Backups

Ops Center has a backup and recovery feature for the Enterprise Controller - you can save the current EC state as a backup file, and restore the EC to that state using the file. It's an important feature, but I've seen a few folks asking for guidelines about how to use it. Every site is different, but here are some broad guidelines that we recommend:

  • Perform a backup at least once a week, and keep at least two backup files.
  • Once you've made a backup file, store it offsite or on a NAS share - don't keep it locally on the EC.
  • You can use a cron job to automate regular backups. Here's a sample cron job to perform a backup:
    0 0 * * 0 /opt/SUNWxvmoc/bin/ecadm backup -o /bigdisk/oc_backups -l /bigdisk/oc_backups
  • Remember that some files and directories are not part of the EC backup for size reasons: ISOs, FLARs, firmware images, and Solaris 8-10 and Linux patches.
    Firmware images are automatically re-downloaded in Connected Mode. ISOs and FLARs can be re-imported. You can also do separate backups of your Ops Center libraries via NetBackup or the like.
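
If the cron job writes a new backup file every week, the backup directory will keep growing. Here's a minimal sketch of a cleanup helper that keeps only the newest few files, in line with the "at least two backup files" guideline. The function name prune_backups is made up, and it assumes the directory holds nothing but backup files, so try it on a scratch directory first.

```shell
# prune_backups DIR [KEEP]
# Deletes all but the KEEP newest regular files in DIR (default: keep 2).
# Hypothetical helper; assumes DIR contains only backup files.
prune_backups() {
  dir=$1
  keep=${2:-2}
  # ls -t lists newest first; tail skips the first $keep names
  ls -t "$dir" | tail -n +"$((keep + 1))" | while IFS= read -r f; do
    if [ -f "$dir/$f" ]; then
      rm -- "$dir/$f"
    fi
  done
}

# Example (not run here): prune_backups /bigdisk/oc_backups 2
```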

Some folks have also asked if there's a good way to test the backup and recovery procedure, to make sure it's working. Well, there's really only one way to do it: do an EC backup, and also back up or clone the file systems. Then, uninstall and reinstall the EC, restore from the backup, and make sure that everything looks right.

Take a look at the Backup and Recovery chapter for more information about how to perform a backup.

Thursday Sep 10, 2015

Updating the OCDoctor on a Managed System

There was a new feature introduced in version 4.38 of the OCDoctor script which has been causing some confusion, so I thought I'd explain it a bit.

Beginning with version 4.38, when you run the OCDoctor script with the --update option on a managed system, the OCDoctor script looks for a newer version on the Enterprise Controller, rather than using external download sites. In connected mode, the Enterprise Controller runs a recurring job to download the latest OCDoctor, which the managed systems can then reach.

This makes updates more feasible if you're in a dark site, and minimizes external connections in other sites. However, if you've downloaded the OCDoctor manually on the EC, you will need to place the OCDoctor zip file in the /var/opt/sun/xvm/images/os/others/ directory on the Enterprise Controller so that managed systems can download it.
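
For the manual case, the copy step might look like the sketch below. The destination directory is the one named above; the helper name stage_ocdoctor and the zip file name in the usage comment are placeholders, since the actual file name depends on the version you downloaded.

```shell
# stage_ocdoctor ZIPFILE [DEST]
# Copies a manually downloaded OCDoctor zip into the directory that
# managed systems pull updates from. Hypothetical helper name.
stage_ocdoctor() {
  zip=$1
  dest=${2:-/var/opt/sun/xvm/images/os/others}
  if [ ! -f "$zip" ]; then
    echo "no such file: $zip" >&2
    return 1
  fi
  if [ ! -d "$dest" ]; then
    echo "no such directory: $dest" >&2
    return 1
  fi
  cp -- "$zip" "$dest/"
}

# Example (placeholder file name): stage_ocdoctor /tmp/OCDoctor-download.zip
```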

Thursday Sep 03, 2015

Installing Ops Center in a Zone

I got a question recently about an Ops Center deployment:

"I'm looking at installing an Enterprise Controller, co-located Proxy Controller, and database inside an Oracle Solaris 11 Zone. Is this doable, and are there any special things I should do to make it work?"

You can install all of these components in an S11 zone. There are a few things that you should do beforehand:

Limit the ZFS ARC cache size in the global zone. Without a limit, the ZFS ARC can consume memory that should be released. The recommended size of the ZFS ARC cache given in the Sizing and Performance guide is equal to (Physical memory - Enterprise Controller heap size - Database memory) x 70%. For example:

  # limit ZFS memory consumption, example (tune memory to your system):
  echo "set zfs:zfs_arc_max=1024989270" >>/etc/system
  echo "set rlim_fd_cur=1024" >>/etc/system
  # set Oracle DB FDs
  projmod -s -K "process.max-file-descriptor=(basic,1024,deny)" user.root

Make sure the global zone has enough swap space configured. The recommended swap space for an EC is twice the physical memory if the physical memory is less than 16 GB, or 16 GB otherwise. For example:

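  # read the swap volume size in bytes (-p prints exact numeric values)
  volsize=$(zfs get -Hp -o value volsize rpool/swap)
  min=$((16 * 1024 * 1024 * 1024))
  if (( volsize < min )); then zfs set volsize=16G rpool/swap; \
  else echo "Swap size sufficient at: $((volsize / 1024 / 1024 / 1024))G"; fi
  zfs list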
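
The swap rule itself is easy to express as a small function if you want to compute the target for a given machine. This is a sketch of the rule exactly as stated above; the function name recommended_swap_gb is made up, and it takes physical memory in whole gigabytes.

```shell
# recommended_swap_gb PHYS_GB
# Twice the physical memory below 16 GB, otherwise 16 GB.
recommended_swap_gb() {
  phys_gb=$1
  if [ "$phys_gb" -lt 16 ]; then
    echo $(( phys_gb * 2 ))
  else
    echo 16
  fi
}

recommended_swap_gb 8    # prints 16
recommended_swap_gb 64   # prints 16
```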

In the non-global zone that you're using for the install, set the ulimit:

  echo "ulimit -Sn 1024">>/etc/profile

Finally, run the OCDoctor to check the prerequisites before you install.

Thursday Aug 27, 2015

How Many Systems Can Ops Center Manage?

I saw a question about how many systems you can manage through Ops Center. This is an important question when you're planning a deployment, or looking at expanding an existing deployment.

In general terms, an Enterprise Controller can manage up to 3,000 assets. A Proxy Controller can manage between 350 and 450 assets, although you'll get better performance if there are fewer assets.
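
For rough planning, you can turn those numbers into a minimum Proxy Controller count by dividing and rounding up. The sketch below assumes assets are spread evenly across Proxy Controllers, which real deployments rarely manage, so treat the result as a floor. The function name proxy_controllers_needed is made up for illustration.

```shell
# proxy_controllers_needed ASSETS [ASSETS_PER_PC]
# Rounds up ASSETS / ASSETS_PER_PC. The default of 350 is the
# conservative end of the 350-450 range given above.
proxy_controllers_needed() {
  assets=$1
  per_pc=${2:-350}
  echo $(( (assets + per_pc - 1) / per_pc ))
}

proxy_controllers_needed 3000       # prints 9
proxy_controllers_needed 3000 450   # prints 7
```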

The Sizing and Performance guide has more detailed information about the requirements and sizing guidelines for Ops Center.

Thursday Aug 20, 2015

Editing or Disabling Analytics

There was a recent question thread about how you can tweak the OS analytics settings in Ops Center.

"Ops Center collects analytics data every 5 minutes and retains it for 5 days. Is it possible to edit these settings?"

You can edit the retention period but not the collection interval.

To edit the retention period, log into the UI. Click the Administration section, then click the Configuration tab for the EC, and select the Report Service subsystem.

The repsvc.daily-samples-retention-days property specifies the number of days to retain OS analytics data. You can edit this property, then restart the EC to make it take effect.

"Can I turn off data collection for OS analytics entirely?"

Yes, you can. Bear in mind that this requires you to edit a config file, so be very careful.

Go to the /opt/sun/n1gc/lib directory on the EC and find the file. Edit it to uncomment this line:


Then, restart the Enterprise Controller.

Thursday Aug 13, 2015

Recovering After a Proxy Controller Crash

I saw a question recently about how to restore your environment if a remote Proxy Controller system fails. This is a good question, and there are a few facets to the answer, depending on your environment.

Recovery is easiest if you have recently backed up the Proxy Controller. The backup file includes asset data, so if you can restore the PC using the backup file, you should be golden.

If you don't have a backup of the Proxy Controller, it's going to take a bit more work. First, you have to migrate the dead PC's assets to another PC. If you have automatic failover enabled, this happens automatically (hence the name); otherwise you can do it manually.

Then, you can install a new Proxy Controller (using the Linux or Solaris procedure), and migrate the assets to that PC.

Thursday Jul 30, 2015

Recovering LDoms From a Failed Server

I saw a recent question about Logical Domain recovery: If you have a control domain installed on a server and the server goes down with a hardware fault, what options do you have for recovering the logical domains from that control domain?

The answer is that you have options depending on how your environment is configured:

  • If you have the control domain in a server pool, and you have enabled automatic recovery on the LDom, you have the option of watching as the LDom is automatically brought back up on another control domain in the server pool.
  • If the control domain is in a server pool but you didn't enable automatic recovery, you can still manually migrate the guests by deleting the control domain asset. The guests will then be put in the Shutdown Guests list for the server pool, and you can bring them up on another control domain.
  • If you want to add the failed control domain back in, before you rediscover it and put it back in the server pool, you should log in and make sure that the guest OS isn't running, to avoid split brain issues.

Take a look at the Recover Logical Domains from a Failed Server how-to for more information.

Thursday Jul 23, 2015

Kernel Zones support in 12.3

One of the new features in Ops Center 12.3 is support for Oracle Solaris kernel zones. I wanted to talk a bit about this, because there are some caveats, and a new document to help you with using this type of zone.

Kernel zones differ from other zones in that they have a separate kernel and OS from the global zone, making them more independent. In Ops Center 12.3, you can discover and manage kernel zones. However, you can't migrate them, put them in a server pool, or change their configuration through the user interface.

We put together a how-to that explains how you can discover existing kernel zones in your environment. You can also take a look at the What's New doc for more information about what's changed in 12.3.


This blog discusses issues encountered in Ops Center and highlights the ways in which the documentation can help you