Simplify your maintenance activities with topology-aware Maintenance Windows in Stack Monitoring. Stack Monitoring is an Oracle Cloud Infrastructure (OCI) service that provides monitoring for applications and their underlying technology stack. When performing maintenance activities on resources, it is desirable to temporarily suppress alarm notifications on these resources to prevent alarm noise. The task of determining the appropriate set of relevant alarm rules for all resources in the maintenance window can be complicated and tedious. Eliminate this complexity by using OCI Stack Monitoring’s new Maintenance Windows. Maintenance Windows can suppress alarm notifications for an entire fleet of hosts and all resources on those hosts or suppress all alarm notifications for an entire application stack (application servers, databases, hosts, storage, load balancers, etc.) in just a few clicks.
Stack Monitoring’s Maintenance Windows UI provides you with a one-stop shop to manage alarm suppressions. From this UI, you can schedule, start, and stop a one-time or repeating Maintenance Window. Additionally, you can view all active, scheduled, and completed Maintenance Windows.
Topology-aware alarm suppressions
When performing maintenance, identifying every alarm rule for the resources involved can be a daunting task. For example, when patching a fleet of hosts, identifying all the alarm rules for the hosts and the resources running on them is time-consuming and error-prone. Maintenance Windows simplify this process by shifting the focus from individual alarm rules to the resources under maintenance. You only need to identify the hosts under maintenance, optionally filtering the desired set of hosts using tags. Because Stack Monitoring is topology-aware, the Maintenance Window automatically identifies all the resources running on those hosts and includes all the associated alarm rules across the entire set.
Maintenance Windows are not limited to host maintenance. Maintenance of any resource monitored in Stack Monitoring benefits from Maintenance Windows. As another example, consider a scenario where you need to patch an Oracle Database System. When creating a Maintenance Window, simply include the Oracle Database System resource (DBSYS9145 in image 2 below), and Stack Monitoring will automatically identify the associated CDB, PDB, and listeners, and then identify and suppress all their associated alarm rules. Had this database system included ASM, it would have been included as well. Including the related resources and identifying their alarm rules automatically reduces the burden on you, the DevOps engineer, to track down every impacted resource and its associated alarms in a maintenance event.
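To make the idea of topology-aware suppression concrete, here is a minimal Python sketch. It is not Stack Monitoring's implementation or API; it only illustrates the general technique of walking a resource topology and collecting alarm rules. The resource names other than DBSYS9145, the rule names, and both helper functions are hypothetical.

```python
from collections import deque

# Hypothetical topology: each resource maps to the resources related to it.
# Only DBSYS9145 comes from the example above; the rest is made up.
TOPOLOGY = {
    "DBSYS9145": ["CDB1", "LISTENER1"],
    "CDB1": ["PDB1"],
}

# Hypothetical alarm rules, keyed by the resource they watch.
ALARM_RULES = {
    "DBSYS9145": ["dbsys-status"],
    "CDB1": ["cdb-cpu", "cdb-storage"],
    "PDB1": ["pdb-sessions"],
    "LISTENER1": ["listener-response"],
    "HOST7": ["host-cpu"],  # unrelated resource, must stay unsuppressed
}


def resources_in_scope(roots, topology):
    """Breadth-first walk from the selected resources to all related ones."""
    ordered, seen = [], set()
    queue = deque(roots)
    while queue:
        resource = queue.popleft()
        if resource in seen:
            continue
        seen.add(resource)
        ordered.append(resource)
        queue.extend(topology.get(resource, []))
    return ordered


def rules_to_suppress(roots, topology, alarm_rules):
    """Collect the alarm rules of every resource reached from the roots."""
    scope = resources_in_scope(roots, topology)
    return sorted(rule for r in scope for rule in alarm_rules.get(r, []))


scope = resources_in_scope(["DBSYS9145"], TOPOLOGY)
rules = rules_to_suppress(["DBSYS9145"], TOPOLOGY, ALARM_RULES)
```

Selecting the single database system is enough to pull in the container database, pluggable database, and listener in this toy model, which is the same shift in focus (resources rather than rules) that Maintenance Windows provide.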
Stack Monitoring’s Maintenance Windows provide flexible scheduling to meet your maintenance needs
- When unplanned maintenance or incidents occur, a Maintenance Window can be scheduled to begin immediately.
- When performing planned maintenance, schedule the Maintenance Window’s start date and time.
- Set the duration, or set a specific end date and time. Defining a Maintenance Window in advance removes a step from the maintenance work plan and lets you stay focused on the system changes rather than on administrative work.
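The scheduling options above can be sketched as a small Python function. This is an illustrative model only, not the service's API: the function name `resolve_window` and its parameters are hypothetical, and the rule that exactly one of duration or end time must be given is an assumption drawn from the wording of the list.

```python
from datetime import datetime, timedelta, timezone


def resolve_window(now, start=None, duration=None, end=None):
    """Return the (start, end) pair for a maintenance window.

    start=None models a window that begins immediately.
    Exactly one of duration or end must be supplied (assumed rule).
    """
    if (duration is None) == (end is None):
        raise ValueError("set exactly one of duration or end")
    begin = now if start is None else start
    finish = begin + duration if duration is not None else end
    if finish <= begin:
        raise ValueError("end must be after start")
    return begin, finish


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# Unplanned incident: start immediately and run for two hours.
incident = resolve_window(now, duration=timedelta(hours=2))

# Planned patching: explicit start and end times.
planned = resolve_window(
    now,
    start=datetime(2024, 1, 6, 22, 0, tzinfo=timezone.utc),
    end=datetime(2024, 1, 7, 2, 0, tzinfo=timezone.utc),
)
```

Timezone-aware datetimes are used here so that a start time entered in advance cannot be misread as local time when the window is evaluated later.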
Identifying resources in an active Maintenance Window
- The Maintenance Windows UI provides a comprehensive list of all active, scheduled, and completed Maintenance Windows.
- The Stack Monitoring home page also displays a banner when a resource is currently in an active Maintenance Window. This banner includes the start and end time of the Maintenance Window, as well as a link to the Maintenance Window itself for more information about the event.
Another way to identify if a resource is in maintenance is the wrench icon. This icon is a key Stack Monitoring UI feature that identifies any resource associated with an active Maintenance Window.
When reviewing open alarms, the Maintenance Window icon allows you to skip over those resources identified as currently included in a Maintenance Window and focus on open alarms that are not currently suppressed.
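The triage step described here, skipping alarms on resources under maintenance, can be modeled in a few lines of Python. This is a sketch over in-memory data, not a call to any Stack Monitoring API; the dictionary layout, resource names, and function names are all hypothetical.

```python
from datetime import datetime, timezone


def is_active(window, now):
    """A window is active from its start up to, but not including, its end."""
    return window["start"] <= now < window["end"]


def unsuppressed_alarms(open_alarms, windows, now):
    """Keep only alarms whose resource is not in any active window."""
    in_maintenance = set()
    for window in windows:
        if is_active(window, now):
            in_maintenance.update(window["resources"])
    return [a for a in open_alarms if a["resource"] not in in_maintenance]


utc = timezone.utc
windows = [
    {   # active at the time of the check
        "start": datetime(2024, 1, 1, 10, 0, tzinfo=utc),
        "end": datetime(2024, 1, 1, 14, 0, tzinfo=utc),
        "resources": ["host-a", "db-a"],
    },
    {   # scheduled for later, so it suppresses nothing yet
        "start": datetime(2024, 1, 2, 10, 0, tzinfo=utc),
        "end": datetime(2024, 1, 2, 14, 0, tzinfo=utc),
        "resources": ["host-b"],
    },
]
open_alarms = [
    {"resource": "host-a", "rule": "cpu-high"},
    {"resource": "host-b", "rule": "disk-full"},
    {"resource": "db-a", "rule": "sessions-high"},
]

check_time = datetime(2024, 1, 1, 12, 0, tzinfo=utc)
needs_attention = unsuppressed_alarms(open_alarms, windows, check_time)
```

In this toy data, only the alarm on `host-b` remains for review, because its Maintenance Window has not started yet.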
Application and infrastructure maintenance can be stressful; your monitoring shouldn’t be. Get started today with OCI Stack Monitoring’s Maintenance Windows to remove some of the burden on you and place the focus where it should be: on implementing successful production changes. Happy Monitoring!
