Thursday Sep 24, 2015

Best Practices for EC Backups

Ops Center has a backup and recovery feature for the Enterprise Controller - you can save the current EC state as a backup file, and restore the EC to that state using the file. It's an important feature, but I've seen a few folks asking for guidelines about how to use it. Every site is different, but here are some broad guidelines that we recommend:

  • Perform a backup at least once a week, and keep at least two backup files.
  • Once you've made a backup file, store it offsite or on a NAS share - don't keep it locally on the EC.
  • You can use a cron job to automate regular backups. Here's a sample cron job to perform a backup:
    0 0 * * 0 /opt/SUNWxvmoc/bin/ecadm backup -o /bigdisk/oc_backups -l /bigdisk/oc_backups
  • Remember that some files and directories are not part of the EC backup for size reasons: ISOs, FLARs, firmware images, and Solaris 8-10 and Linux patches.
    Firmware images are automatically re-downloaded in Connected Mode. ISOs and FLARs can be re-imported. You can also do separate backups of your Ops Center libraries via NetBackup or the like.
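
If you automate backups with cron, you'll probably also want something to prune old backup files so the disk doesn't fill up. Here's a minimal sketch of that, not an official tool. The prune_backups function name is made up for this example, and it assumes the directory holds only backup files (so send the ecadm log somewhere else if you use it):

```shell
#!/bin/sh
# Hypothetical helper: keep only the newest KEEP files in a backup directory.
# Assumes the directory contains nothing but backup files.
prune_backups() {
    dir=$1
    keep=${2:-2}    # the guideline above is to keep at least two backups
    # List files newest first, skip the first $keep, and delete the rest.
    ls -1t "$dir" | tail -n +"$((keep + 1))" | while IFS= read -r f; do
        rm -f "$dir/$f"
    done
}
```

You could call prune_backups /bigdisk/oc_backups 2 from a wrapper script after the ecadm backup command finishes, and then copy the newest file to your NAS share or offsite location.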

Some folks have also asked if there's a good way to test the backup and recovery procedure, to make sure it's working. Well, there's really only one way to do it - do an EC backup, and also back up or clone the file systems. Then, uninstall and reinstall the EC, restore from the backup, and make sure that everything looks right.

Take a look at the Backup and Recovery chapter for more information about how to perform a backup.

Thursday Sep 10, 2015

Updating the OCDoctor on a Managed System

There was a new feature introduced in version 4.38 of the OCDoctor script which has been causing some confusion, so I thought I'd explain it a bit.

Beginning with version 4.38, when you run the OCDoctor script with the --update option on a managed system, the OCDoctor script looks for a newer version on the Enterprise Controller, rather than using external download sites. In connected mode, the Enterprise Controller runs a recurring job to download the latest OCDoctor, which the managed systems can then reach.

This makes updates more feasible if you're in a dark site, and minimizes external connections in other sites. However, if you've downloaded the OCDoctor manually on the EC, you will need to place the OCDoctor zip file in the /var/opt/sun/xvm/images/os/others/ directory on the Enterprise Controller so that managed systems can download it.
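
If you're staging the zip file by hand, a small helper like this can save you from typos in the path. It's only a sketch: the stage_ocdoctor name is made up, and the zip file name you pass in is whatever you downloaded. The default destination is the directory mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: copy a manually downloaded OCDoctor zip into the
# directory on the Enterprise Controller that managed systems download from.
stage_ocdoctor() {
    zip=$1
    dest=${2:-/var/opt/sun/xvm/images/os/others}
    if [ ! -f "$zip" ]; then
        echo "no such file: $zip" >&2
        return 1
    fi
    if [ ! -d "$dest" ]; then
        echo "missing directory: $dest" >&2
        return 1
    fi
    cp "$zip" "$dest/"
}
```

Run it on the EC with the path to your downloaded zip as the first argument.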

Thursday Sep 03, 2015

Installing Ops Center in a Zone

I got a question recently about an Ops Center deployment:

"I'm looking at installing an Enterprise Controller, co-located Proxy Controller, and database inside an Oracle Solaris 11 Zone. Is this doable, and are there any special things I should do to make it work?"

You can install all of these components in an S11 zone. There are a few things that you should do beforehand:

Limit the ZFS ARC size in the global zone. Without a limit, the ZFS ARC can consume memory that should be released. The recommended size of the ZFS ARC given in the Sizing and Performance guide is equal to (Physical memory - Enterprise Controller heap size - Database memory) x 70%. For example:

  # limit ZFS memory consumption, example (tune memory to your system):
  echo "set zfs:zfs_arc_max=1024989270" >>/etc/system
  echo "set rlim_fd_cur=1024" >>/etc/system
  # set Oracle DB FDs
  projmod -s -K "process.max-file-descriptor=(basic,1024,deny)" user.root
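
To work out a value for your own system, you can plug your numbers into the formula above. This is a rough sketch with a made-up function name (arc_max_bytes); it takes whole gigabytes and prints a byte count, and the memory figures in the usage note are purely illustrative:

```shell
#!/bin/sh
# Hypothetical helper: apply the sizing formula
#   (physical memory - EC heap size - database memory) x 70%
# Inputs are whole gigabytes; output is bytes, rounded down.
arc_max_bytes() {
    phys_gb=$1
    heap_gb=$2
    db_gb=$3
    avail_gb=$((phys_gb - heap_gb - db_gb))
    if [ "$avail_gb" -le 0 ]; then
        echo "no memory left over for the ARC" >&2
        return 1
    fi
    echo $((avail_gb * 1024 * 1024 * 1024 * 70 / 100))
}
```

For instance, arc_max_bytes 16 4 6 prints the limit for a system with 16 GB of physical memory, a 4 GB EC heap, and 6 GB of database memory. Check the result against the Sizing and Performance guide before you put it in /etc/system.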

Make sure the global zone has enough swap space configured. The recommended swap space for an EC is twice the physical memory if the physical memory is less than 16 GB, or 16 GB otherwise. For example:

  volsize=$(zfs get -H -o value volsize rpool/swap)
  volsize=${volsize%G}
  volsize=${volsize%%.*}
  if (( $volsize < 16 )); then zfs set volsize=16G rpool/swap; \
  else echo "Swap size sufficient at: ${volsize}G"; fi
  zfs list
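
The swap rule itself is easy to express as a tiny function if you want to compute the target size for a given amount of memory. This is just a sketch of the rule as stated above; the recommended_swap_gb name is made up, and it works in whole gigabytes:

```shell
#!/bin/sh
# Hypothetical helper: recommended EC swap in GB for a given amount of
# physical memory in GB. Twice the memory below 16 GB, otherwise 16 GB.
recommended_swap_gb() {
    mem_gb=$1
    if [ "$mem_gb" -lt 16 ]; then
        echo $((mem_gb * 2))
    else
        echo 16
    fi
}
```

You could feed the result into the zfs set volsize command shown above instead of hard-coding 16G.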

In the non-global zone that you're using for the install, set the ulimit:

  echo "ulimit -Sn 1024">>/etc/profile

Finally, run the OCDoctor to check the prerequisites before you install.

Thursday Aug 27, 2015

How Many Systems Can Ops Center Manage?

I saw a question about how many systems you can manage through Ops Center. This is an important question when you're planning a deployment, or looking at expanding an existing deployment.

In general terms, an Enterprise Controller can manage up to 3,000 assets. A Proxy Controller can manage between 350 and 450 assets, although you'll get better performance if there are fewer assets.
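
For rough planning, you can turn those numbers into a quick estimate of how many Proxy Controllers you'd need. This is a back-of-the-envelope sketch, not a sizing tool: the proxies_needed name is made up, and it defaults to the conservative end of the range (350 assets per PC).

```shell
#!/bin/sh
# Hypothetical helper: minimum number of Proxy Controllers for a given
# asset count, rounding up. The per-PC figure defaults to 350.
proxies_needed() {
    assets=$1
    per_pc=${2:-350}
    echo $(( (assets + per_pc - 1) / per_pc ))
}
```

Calling proxies_needed 3000 gives the estimate at 350 assets per PC, and proxies_needed 3000 450 gives it at the upper end of the range. Since performance is better with fewer assets per PC, treat the result as a floor.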

The Sizing and Performance guide has more detailed information about the requirements and sizing guidelines for Ops Center.

Thursday Aug 20, 2015

Editing or Disabling Analytics

There was a recent question thread about how you can tweak the OS analytics settings in Ops Center.

"Ops Center collects analytics data every 5 minutes and retains it for 5 days. Is it possible to edit these settings?"

You can edit the retention period but not the collection interval.

To edit the retention period, log into the UI. Click the Administration section, then click the Configuration tab for the EC, and select the Report Service subsystem.

The repsvc.daily-samples-retention-days property specifies the number of days to retain OS analytics data. You can edit this property, then restart the EC to make it take effect.

"Can I turn off data collection for OS analytics entirely?"

Yes, you can. Bear in mind that this requires you to edit a config file, so be very careful.

Go to the /opt/sun/n1gc/lib directory on the EC and find the XVM_SATELLITE.properties file. Edit it to uncomment this line:

#report.service.disable=true

Then, restart the Enterprise Controller.
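
If you'd rather script the edit than open the file by hand, something like this would do it while keeping a backup copy. It's a minimal sketch with a made-up function name (disable_os_analytics), and it assumes the line appears in the file exactly as shown above:

```shell
#!/bin/sh
# Hypothetical helper: uncomment report.service.disable=true in a properties
# file. The original is saved next to it with a .bak suffix first.
disable_os_analytics() {
    file=${1:-/opt/sun/n1gc/lib/XVM_SATELLITE.properties}
    cp -p "$file" "$file.bak" || return 1
    # Rewrite from the backup so the original file keeps its permissions.
    sed 's/^#report\.service\.disable=true/report.service.disable=true/' \
        "$file.bak" > "$file"
}
```

If anything looks wrong afterward, copy the .bak file back into place before you restart the EC.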

Thursday Aug 13, 2015

Recovering After a Proxy Controller Crash

I saw a question recently about how to restore your environment if a remote Proxy Controller system fails. This is a good question, and there are a few facets to the answer, depending on your environment.

Recovery is easiest if you have recently backed up the Proxy Controller. The backup file includes asset data, so if you can restore the PC using the backup file, you should be golden.

If you don't have a backup of the Proxy Controller, it's going to take a bit more work. First, you have to migrate the dead PC's assets to another PC. If you have automatic failover enabled, this happens automatically (hence the name); otherwise you can do it manually.

Then, you can install a new Proxy Controller (using the Linux or Solaris procedure), and migrate the assets to that PC.

Thursday Jul 30, 2015

Recovering LDoms From a Failed Server

I saw a recent question about Logical Domain recovery: If you have a control domain installed on a server and the server goes down with a hardware fault, what options do you have for recovering the logical domains from that control domain?

The answer is that you have options depending on how your environment is configured:

  • If you have the control domain in a server pool, and you have enabled automatic recovery on the LDom, you have the option of watching as the LDom is automatically brought back up on another control domain in the server pool.
  • If the control domain is in a server pool but you didn't enable automatic recovery, you can still manually migrate the guests by deleting the control domain asset. The guests will then be put in the Shutdown Guests list for the server pool, and you can bring them up on another control domain.
  • If you want to add the failed control domain back in, before you rediscover it and put it back in the server pool, you should log in and make sure that the guest OS isn't running, to avoid split brain issues.

    Take a look at the Recover Logical Domains from a Failed Server how-to for more information.

    Thursday Jul 23, 2015

    Kernel Zones support in 12.3

    One of the new features in Ops Center 12.3 is support for Oracle Solaris kernel zones. I wanted to talk a bit about this, because there are some caveats, and a new document to help you with using this type of zone.

    Kernel zones differ from other zones in that they have a separate kernel and OS from the global zone, making them more independent. In Ops Center 12.3, you can discover and manage kernel zones. However, you can't migrate them, put them in a server pool, or change their configuration through the user interface.

    We put together a how-to that explains how you can discover existing kernel zones in your environment. You can also take a look at the What's New doc for more information about what's changed in 12.3.

    Thursday Jul 16, 2015

    New Books in 12.3

    One of the changes that we've made in Ops Center 12.3 is a change to the documentation library. We've divided the old Feature Reference Guide up into several smaller books so that it's easier to use:

    • Configure Reference talks about how to get the software working - discovering assets; configuring libraries, networks, and storage; and managing jobs.
    • Operate Reference talks about incidents, reports, hardware management, and OS management, provisioning, and updating.
    • Virtualize Reference describes the use and management of Oracle Solaris Zones, Oracle VM Servers for SPARC, and server pools.
    • Oracle SuperCluster Operate Reference covers the management of Oracle SuperCluster.

    The What's New doc has more information about these new books. You can find the new books by clicking Feature Reference on the main doc site.

    Thursday Jul 09, 2015

    New Virtualization Icons

    There's a change in the UI that I wanted to talk about, since it's been confusing some people after they upgrade to version 12.3. The icons that represent the different virtualization types, such as Oracle Solaris Zones or Logical Domains, have changed. Here are the new icons:

    We made this change because there were getting to be a lot of supported virtualization types, particularly now that Kernel Zones are supported. The new icons make it easier to differentiate between different types so that you know at a glance what sort of system you're dealing with.

    The other new features in version 12.3 are discussed in the What's New document.

    Thursday Jun 25, 2015

    Ops Center 12.3 is Released

    Version 12.3 of Ops Center is now available!

    This is a major upgrade from the prior versions. In addition to bug fixes and performance enhancements, there are a number of new features:

    • Asset discovery refinements, including the ability to run discovery probes only on a specific network or from a specific Proxy Controller
    • Support for discovering and managing existing Oracle Solaris 11 Kernel Zones
    • The ability to create a custom Oracle Solaris 11 AI manifest and use it for provisioning
    • Refined search: Searching for one or more assets now displays the search results in a new tab in the Navigation pane, making navigation a bit easier
    • New and expanded books in the doc library

    Take a look at the What's New In This Release document for a more detailed breakdown of the new features, and the Upgrade guide for more information about upgrading to version 12.3.

    You can also take a look at the 12.3 documentation library here.

    Thursday Jun 11, 2015

    Providing Contact Info for ASR

    Ops Center includes a feature called Auto Service Request, which can automatically file service requests for managed hardware. However, I've seen a bit of confusion about how to get it running.

    First, the prereqs - to get ASR running, you need to be in connected mode, and you need to have a set of My Oracle Support (MOS) credentials entered in the Edit Authentications window. Your MOS credentials have to be associated with a customer service identifier (CSI) with rights over the hardware that you want to be enabled for ASR.

    Once you've got that, you'll click the Edit ASR Contact Information action in the Administration section. This opens a window where you specify the default contact information for your assets, which is used for all ASRs by default.

    If you have assets that need separate contact information, you can specify separate ASR contact information for an asset or a group of assets. That info is used in place of the default contact info.

    Finally, once you've got the contact info in the system, you click Enable ASR. This action launches a job to enable the assets for ASR, and it attempts to enable new assets for ASR when they're discovered. From then on, if a critical incident occurs on the hardware, ASR should create a service request for it.

    Take a look at the Auto Service Request chapter of the Admin Guide for more information.

    Thursday Jun 04, 2015

    Enterprise Controllers in Logical Domains

    I saw a few questions about installing Enterprise Controllers in Logical Domains, and what's possible with that sort of deployment. Here are some answers:

    "Is it supported to install the Enterprise Controller in a Logical Domain?"

    Yep. The Certified Systems Matrix lists the supported OSes for EC installation, and Oracle VM Server for SPARC is supported (as are some Oracle Solaris Zones).

    "Can you use Oracle Solaris Cluster to provide High Availability for an Enterprise Controller installed on a Logical Domain?"

    Yes, this is possible. It deserves its own post, so I'll go into more detail on it soon, but yes, it works.

    "If I have two Enterprise Controllers installed on Logical Domains, can I have EC 1 discover and manage the LDom for EC 2, and vice versa?"

    No. The Agent Controllers installed on EC and PC systems are different from standard Agents, and if you install an Agent from one EC on another EC's system, it's going to get confused.

    Thursday May 28, 2015

    Uploading and Deploying Oracle Solaris 11 Files

    I saw a question recently about uploading flat files (such as a config file) or tarballs to an Oracle Solaris 11 library and then deploying them to Oracle Solaris 11 servers. This is an easy task for Oracle Solaris 8, 9, or 10, but it's trickier to figure out with Oracle Solaris 11.

    Here are the steps to upload and deploy such files with Oracle Solaris 11 in Ops Center, using our software library for the content.

    1. Create an Oracle Solaris 11 pkg which contains the config files. Here's an example of how to do so: http://docs.oracle.com/cd/E23824_01/html/E21798/glcej.html
    2. Add that pkg to the repository. (The above example also covers this step.)
    3. Sync Ops Center with the repository so that the new pkg is added to Ops Center's catalog of software.
    4. Create an Ops Center Oracle Solaris 11 Profile that installs the pkg created in Step 1.
    5. Apply the profile in an update plan to the target systems.

    For more information about OS Profiles, see the OS Updates chapter.

    Thursday May 21, 2015

    Special Database Options

    When you're installing Ops Center, you have two options for the product database: You can use an embedded database that's automatically installed on the Enterprise Controller and managed by Ops Center, or you can use a remote database that you manage yourself.

    With regard to the customer-managed database, I saw an important question recently: When you install this database, do you have to enable any of the advanced or special features? Some folks want to use the bare minimum installation for security reasons.

    The answer here is that Ops Center only requires the base installation; no special features are used. As long as you're using one of the DB versions listed in the Certified Systems Matrix, you're golden.

    About

    This blog discusses issues encountered in Ops Center and highlights the ways in which the documentation can help you.
