At Oracle Cloud Infrastructure (OCI), we strive to continuously improve our Anomaly Detection service. OCI Anomaly Detection enables you to monitor and detect anomalies in your time-series data. Today, we’re excited to announce the general availability of the following new capabilities for the Anomaly Detection service:
Univariate anomaly detection
Multivariate anomaly detection improvements
Asynchronous detection
Univariate anomaly detection (UAD) refers to the problem of identifying anomalies in a single time series. A single time series contains timestamped values for one signal, such as a metric or measure.
With this release, the service fully supports detecting three types of anomalies in univariate signals: point, collective, and contextual anomalies.
Point anomaly detection finds single data points that are unusual compared to the rest of the data in the dataset.
Collective anomaly detection finds groups of related data instances that are anomalous with respect to the whole dataset.
Contextual anomaly detection finds data points that are considered abnormal when viewed against contextual attributes associated with the data points. Viewing data in the context of time, or in the context of time-related concepts such as seasons, weekdays, and weekends, can reveal anomalous behavior directly correlated with such context.
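To make the difference between a point anomaly and a contextual anomaly concrete, here is a minimal Python sketch. It is not the service's algorithm: the z-score rule, the `zscore_flags` and `contextual_flags` helpers, the threshold of 2.0, and the day/night temperature readings are all assumptions made for illustration.

```python
import statistics
from collections import defaultdict


def zscore_flags(values, threshold=2.0):
    """Flag values whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return [False] * len(values)
    return [abs(v - mean) / stdev > threshold for v in values]


def contextual_flags(values, contexts, threshold=2.0):
    """Apply the same z-score rule, but only within each context group."""
    groups = defaultdict(list)
    for index, context in enumerate(contexts):
        groups[context].append(index)

    flags = [False] * len(values)
    for indices in groups.values():
        group_flags = zscore_flags([values[i] for i in indices], threshold)
        for i, flagged in zip(indices, group_flags):
            flags[i] = flagged
    return flags


# Hypothetical temperature readings; the last night reading is warm for night-time.
night = [10, 11, 10, 9, 10, 11, 10, 30]
day = [30, 31, 29, 30, 30, 31, 29, 30]
values = night + day
contexts = ["night"] * len(night) + ["day"] * len(day)

global_flags = zscore_flags(values, threshold=2.0)
context_flags = contextual_flags(values, contexts, threshold=2.0)
```

In this data, the night-time reading of 30 is unremarkable against the whole series, because daytime readings are also around 30, so the global check does not flag it. Compared only with other night-time readings, it stands out, which is the behavior the contextual description refers to.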
Other improvements include the following examples:
Performance improvements for both training and evaluation
Now uses NumExpr, a fast numerical expression evaluator for NumPy, instead of plain NumPy for algebraic and transcendental function evaluations
Intel Math Kernel Library (MKL) support to accelerate function evaluation
Improved performance of the sklearn pairwise distance metric calculation for improved detection: Implemented step-wise matrix multiplication to replace the loops used in the sklearn package
Implemented efficient memory handling for large batch sizes (up to 10K): Batch-based column processing calculates the pairwise distance and avoids the memory issues caused by storing a large matrix
Preprocessing improvements
Interquartile range (IQR) based outlier detection and removal
Trend and seasonality decomposition: Seasonal trend decomposition using Loess (STL) or linear detrending
Kernel improvements for one-class support vector machines (OCSVM) using automatic hyperparameter tuning: Dynamic window size selection using periodicity detection, an autocorrelation function, and a heuristic-based frequency detector.
Postprocessing improvements during detection: We prune excess anomalies by suppressing anomalies that appear consecutively in groups larger than the window size, so that flagging does not extend beyond a window's worth of data points.
User-specified tuning: A new sensitivity parameter in detection lets you adjust the number of anomalies flagged by selecting the appropriate threshold, without having to retrain.
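Three of the items above (IQR-based outlier removal, the sensitivity parameter, and pruning of long anomaly runs) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the service's implementation: the function names, the `k=1.5` IQR multiplier, the mapping `threshold = 1 - sensitivity` for scores normalized to [0, 1], and the choice to keep only the first `window_size` flags of a long run are all hypothetical.

```python
import statistics


def remove_iqr_outliers(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def flag_by_sensitivity(scores, sensitivity):
    """Flag anomaly scores at or above a threshold derived from `sensitivity`.

    Assumes scores lie in [0, 1]. The mapping threshold = 1 - sensitivity is a
    hypothetical choice: higher sensitivity lowers the threshold and flags more.
    """
    threshold = 1.0 - sensitivity
    return [score >= threshold for score in scores]


def prune_long_runs(flags, window_size):
    """Keep at most `window_size` consecutive flags in any run (assumed rule)."""
    pruned = []
    run_length = 0
    for flagged in flags:
        run_length = run_length + 1 if flagged else 0
        pruned.append(flagged and run_length <= window_size)
    return pruned


# Preprocessing: the value 100 falls far outside the IQR fence and is dropped.
cleaned = remove_iqr_outliers([10, 11, 12, 11, 10, 12, 11, 100])

# Detection: the same scores, re-thresholded without retraining anything.
scores = [0.1, 0.9, 0.95, 0.9, 0.97, 0.6, 0.2]
strict = flag_by_sensitivity(scores, sensitivity=0.2)
loose = flag_by_sensitivity(scores, sensitivity=0.5)

# Postprocessing: a run of five flags is cut back to the first three.
pruned = prune_long_runs(loose, window_size=3)
```

Because the sensitivity setting only changes how existing scores are thresholded, it can be adjusted at detection time, which matches the "without having to retrain" point above.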
This release also introduces the MSET2 multivariate anomaly detection kernel to support large batch size calls with asynchronous detection, which greatly improves detection accuracy, specifically for prognostics use cases. This capability helps surveillance-based use cases detect anomalies in the context of a historical state, with the following details:
Call the service using the explicit option for multivariate MSET with asynchronous detection.
The service computes states based on cumulative sums, using an appropriately large batch size internally.
This capability offers improved performance by retaining the historical context, resulting in a lower missed alarm rate than synchronous detection.
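The following Python sketch illustrates why carrying cumulative state across batches can lower the missed alarm rate. The announcement states only that states are computed from cumulative sums; the one-sided CUSUM statistic used here is a generic textbook technique swapped in for illustration and is not MSET2. The `cusum_alarms` helper, its parameters, and the sample readings are assumptions.

```python
def cusum_alarms(values, target, slack, threshold, state=0.0):
    """One-sided CUSUM: accumulate deviations above target + slack.

    Returns per-point alarm flags and the final cumulative state, so a caller
    can either discard the state or carry it into the next batch.
    """
    alarms = []
    for value in values:
        state = max(0.0, state + (value - target - slack))
        alarms.append(state > threshold)
    return alarms, state


def split_batches(values, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


# A small but persistent upward shift, processed in batches of three points.
readings = [1.0] * 12
batches = split_batches(readings, 3)

# Without history: the cumulative state restarts at zero for every batch.
stateless = []
for batch in batches:
    alarms, _ = cusum_alarms(batch, target=0.0, slack=0.5, threshold=2.0)
    stateless.extend(alarms)

# With history: the cumulative state is carried from one batch to the next.
stateful = []
state = 0.0
for batch in batches:
    alarms, state = cusum_alarms(
        batch, target=0.0, slack=0.5, threshold=2.0, state=state
    )
    stateful.extend(alarms)
```

In this toy example, the version that restarts for each small batch never accumulates enough evidence to raise an alarm, while the version that keeps its history starts alarming partway through the series.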
Customers can now use the Asynchronous Detection API on large to very large data sets (100 million–billions of data points). This API has the following capabilities:
Extends the existing Anomaly Detection service capabilities
Supports large datasets (from 30K data points to 100 million+ data points)
Supports training data with up to 1000+ signals (available on request)
Delivers high model accuracy by enabling model training with better model characteristics, such as window size and memory vectors
Accepts input inline or as a list of objects in OCI Object Storage. The different input modes give you flexibility before onboarding to the service.
Frees you from having to develop custom apps to perform anomaly detection on large data sets
Can be extended to provide more capabilities such as automated preprocessing, retraining, and ground truth integration for continuous model improvements.
Other asynchronous detection improvements include the following examples:
Encryption of intermediate data
Load balancing: IP Virtual Server (IPVS) for network routing within the Kubernetes (K8s) cluster
Parallel request handling: Optimistic locking for database queries to handle parallel requests
Horizontal autoscaling for pods and autoscaling for clusters
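Optimistic locking, listed above for parallel request handling, can be sketched with a version counter on each row. The sketch below is a generic illustration using an in-memory dictionary, not the service's database code: `InMemoryStore`, `VersionConflict`, `update_with_retry`, and the `job-1` key are hypothetical names.

```python
class VersionConflict(Exception):
    """Raised when a row changed since the caller last read it."""


class InMemoryStore:
    """Toy key-value store in which every row carries a version number."""

    def __init__(self):
        self._rows = {}

    def insert(self, key, value):
        self._rows[key] = (value, 0)

    def read(self, key):
        """Return the (value, version) pair for a key."""
        return self._rows[key]

    def update(self, key, new_value, expected_version):
        """Write only if the stored version still matches the one that was read."""
        _, current_version = self._rows[key]
        if current_version != expected_version:
            raise VersionConflict(key)
        self._rows[key] = (new_value, current_version + 1)


def update_with_retry(store, key, transform, max_attempts=3):
    """Read, transform, and write back; re-read and retry on a version conflict."""
    for _ in range(max_attempts):
        value, version = store.read(key)
        try:
            store.update(key, transform(value), version)
            return True
        except VersionConflict:
            continue
    return False


store = InMemoryStore()
store.insert("job-1", 0)

# Two workers read the same row version before either one writes.
_, version_a = store.read("job-1")
_, version_b = store.read("job-1")

store.update("job-1", 1, version_a)  # first writer succeeds
try:
    store.update("job-1", 5, version_b)  # second writer holds a stale version
    conflict_detected = False
except VersionConflict:
    conflict_detected = True

# The losing worker re-reads the row and applies its change on top.
retried = update_with_retry(store, "job-1", lambda n: n + 1)
```

No lock is held while a worker computes its change, so parallel requests do not block each other; a conflict is detected only at write time and handled by retrying.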
For more information on Oracle Cloud Infrastructure Anomaly Detection service, see the following resources:
Try OCI Anomaly Detection with LiveLabs