Oracle Machine Learning supports a wide range of machine learning tools and interfaces for data scientists, ML engineers, data engineers, and application developers. As part of Oracle’s converged database, Oracle Machine Learning is one of the many capabilities readily available in Oracle Database and Oracle Autonomous Database, which provide an integrated (not fragmented) database supporting a wide range of data representations, workloads, and tools for delivering complete solutions. Here are 10 reasons why you should use Oracle Machine Learning in Oracle Database and Oracle Autonomous Database.

Reason #1: Eliminate data movement

Loading data into third-party machine learning engines, whether from disk or databases, comes at a cost: the time to move data from its source to the target system, the time to load data into the engine’s memory, and the impact each of these has on scalability – memory, storage, and bandwidth – and complexity. Enterprises already using Oracle databases can easily avoid these costs by using machine learning algorithms built into the database engine for model development and deployment, as well as data exploration and preparation.

Reason #2: Reduced complexity

Using separate machine learning engines for modeling often requires additional installation, configuration, and/or management of those tools. With separate tools comes the need, for example, to develop mechanisms for data retrieval, local storage, fault tolerance, parallelism, container management, and security. More “moving parts” means more code to address error and failure conditions along with more complex testing requirements. This applies across the machine learning lifecycle – from data access and preparation to model building and scoring.

In addition, the resulting machine learning models need to be stored, managed, and secured. With Oracle Machine Learning, in-database machine learning models are right where the data reside – existing as first-class database objects in the user schema.

For batch and even real-time predictions, scoring with Oracle Machine Learning can be as simple as running a SQL query. Batch and singleton scoring results can be used dynamically or stored directly in the database.  
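
As a rough illustration of what such a scoring query can look like, the Python sketch below only assembles the SQL text. PREDICTION and PREDICTION_PROBABILITY are Oracle SQL scoring operators; the model name CHURN_MODEL, the table CUSTOMERS, the column CUST_ID, and the helper function itself are hypothetical and not taken from any Oracle example.

```python
def build_scoring_query(model_name: str, table_name: str, id_column: str) -> str:
    """Return a SQL statement that scores every row of a table in the database.

    PREDICTION and PREDICTION_PROBABILITY are Oracle SQL scoring operators;
    the model, table, and column names passed in are hypothetical.
    """
    for name in (model_name, table_name, id_column):
        # Identifiers cannot be bound as parameters, so allow only simple names.
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"SELECT {id_column}, "
        f"PREDICTION({model_name} USING *) AS predicted_value, "
        f"PREDICTION_PROBABILITY({model_name} USING *) AS probability "
        f"FROM {table_name}"
    )


# Hypothetical names, for illustration only.
query = build_scoring_query("CHURN_MODEL", "CUSTOMERS", "CUST_ID")
print(query)
```

The resulting string could be run through whichever Oracle client an application already uses. Adding a WHERE clause on the ID column would turn the same batch query into a singleton score.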

Reason #3: APIs for the top languages for data science

Python, R, and SQL are often cited as the top languages used for data science and machine learning. Oracle Machine Learning provides APIs in each of these languages to enable modeling with in-database machine learning algorithms. Data professionals know SQL is an essential tool for manipulating data. With Oracle Machine Learning, this same powerful language can be used for model deployment, and through PL/SQL, for model building. The Python and R APIs also enable users to explore and prepare data through database-optimized Python and R functions that operate on data frame proxy objects – leveraging the database as a high-performance compute engine. Supporting these languages lets users choose the most efficient language for the task or the one that matches their preferences.

For Python and R, Oracle Machine Learning also enables invoking user-defined Python and R functions using database-spawned and controlled engines, along with data-parallel and task-parallel options. APIs are provided for Python, R, SQL, and REST.

Reason #4: No-code user interfaces

Not everyone can write code or prefers to. Even for those who do, such as data scientists and ML engineers, no-code user interfaces can offer productivity gains and accelerate the machine learning process. For those who don’t code or are non-experts in the use of specific algorithms, no-code user interfaces as found in the Oracle Machine Learning AutoML UI and Oracle Data Miner enable users to take advantage of powerful in-database machine learning features.

Oracle Machine Learning AutoML UI automates model building with minimal user input – just specify the data table and the target in what’s called an “experiment” and the tool takes it from there to produce and rank machine learning models. You can also generate notebooks for selected ML models based on the OML4Py API. Further, you can deploy models to OML Services for scoring data to support real-time and streaming applications as well as asynchronous batch scoring – all using REST endpoints.

The Oracle Data Miner user interface – a SQL Developer extension – supports the creation of analytical workflows through an easy-to-use drag-and-drop interface and automates many common machine learning steps, such as data preparation, model building, and model evaluation.

Reason #5: Data and model governance

With all the data breaches reported in recent years, data security is foremost on the minds of executives and LOB managers. Oracle Machine Learning implicitly benefits from the built-in security of Oracle Database and Autonomous Database to mitigate threats.

In addition to data security, in-database machine learning models are first-class objects in Oracle Database. They are created directly in the database and are immediately usable from the database environment. You can control access by granting and revoking permissions, audit user actions, and export and import machine learning models across databases. Further, the database serves as a queryable model repository.

As with other database objects, you can export and import models across databases, including between cloud and non-cloud Oracle Database instances. With Python and R, native objects can be stored directly in the database (as opposed to flat files) and managed as part of your database schema for security, backup, and recovery.

Reason #6: ML Algorithms designed for scalability and performance

Oracle Machine Learning provides a wide range of powerful machine learning algorithms that are designed for parallelism to take advantage of multiprocessor machines and multi-node clusters. Oracle Machine Learning supports batch and real-time scoring at scale: on the Exadata platform, SQL prediction operators are optimized by pushing computations down to the storage tier. Further, in-database algorithms are designed for optimized memory utilization, for example by bringing data into memory incrementally as needed and caching models to share across queries.

For user-defined Python or R functions, OML4Py and OML4R embedded execution supports built-in data-parallel and task-parallel invocation of user-defined functions. This means, for example, that an open source scikit-learn SVM model can be used for scoring in parallel, across multiple Python engines spawned and managed by the database environment in an automated way – using a single function invocation.
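
The data-parallel idea can be sketched in plain Python. This is not the OML4Py API. Local threads stand in for database-spawned Python engines, and a hypothetical threshold rule in `score_chunk` stands in for a real model such as a scikit-learn SVM. What the sketch shows is the pattern: split the rows into chunks, apply one user-defined function to each chunk concurrently, and reassemble the results in order.

```python
from concurrent.futures import ThreadPoolExecutor


def score_chunk(rows):
    """Hypothetical user-defined function: score one chunk of rows."""
    # Stand-in for a real model: flag rows whose feature sum exceeds 1.0.
    return [1 if sum(row) > 1.0 else 0 for row in rows]


def score_in_parallel(rows, chunk_size=2, workers=4):
    """Split rows into chunks, score the chunks concurrently, and
    return the predictions in the original row order."""
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so row order is preserved.
        return [pred for chunk in pool.map(score_chunk, chunks) for pred in chunk]


data = [(0.2, 0.3), (0.9, 0.4), (0.5, 0.5), (1.2, 0.1), (0.0, 0.0)]
predictions = score_in_parallel(data)
print(predictions)
```

With embedded execution, the chunking, engine startup, and result collection are handled by the database environment, so the caller supplies only the function and the data.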

On Oracle Autonomous Database, you can take advantage of elastic scaling for machine learning modeling as well as scoring. With model deployment through REST endpoints, OML Services with Oracle Autonomous Database supports real-time scoring using in-database models as well as third-party ONNX format models.

Reason #7: Automation

Automation is key to meeting rapid machine learning model development cycles, especially when the process can be highly iterative and repetitive. Automation can occur at several levels: data preparation, building individual models, or broader modeling that involves selecting the best algorithms, automatically building and tuning models, and selecting the best model – called automated machine learning, or AutoML.

Oracle Machine Learning provides automation at each of these levels. For example, individual algorithms typically have specific data preparation requirements, e.g., some accept only numeric data, normalized data, or binned data. Oracle Machine Learning supports algorithm-specific automatic data preparation, such as the one-hot encoding of categorical variables, and binning or normalizing numeric variables. Such transformations are stored with the model and automatically applied at the time of scoring. This allows you to focus on other aspects of the ML process while retaining complete control to override this feature.
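
The sketch below illustrates the principle of storing transformations with the model. It is a toy, not Oracle's implementation. The class `PreparedModel`, its min-max normalization, and the 0.5 threshold are all assumptions chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class PreparedModel:
    """Toy model that keeps its data-preparation parameters with it."""
    minimum: float
    maximum: float
    threshold: float = 0.5

    @classmethod
    def fit(cls, values):
        # "Training" records the min-max normalization parameters.
        return cls(minimum=min(values), maximum=max(values))

    def _normalize(self, value):
        span = self.maximum - self.minimum
        return 0.0 if span == 0 else (value - self.minimum) / span

    def predict(self, raw_value):
        # The stored transformation is applied automatically at scoring time,
        # so callers pass raw, unprepared values.
        return 1 if self._normalize(raw_value) >= self.threshold else 0


prepared = PreparedModel.fit([10.0, 20.0, 30.0, 50.0])
print(prepared.predict(45.0), prepared.predict(15.0))
```

Because the normalization parameters travel with the model, the code that scores new data never has to repeat, or even know about, the preparation applied during training.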

Other automation for individual algorithms includes integrated text mining, where one or more columns can be identified as text, and features are automatically extracted and combined with other structured data for model building and scoring.

Partitioned models automate the creation of a model ensemble, where one model is built on each partition of data – specified by one or more columns. However, you don’t need to manage these individual models. Instead, you’re provided a single top-level model for simplified scoring.
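
A toy version of the partitioned-model idea, again in plain Python and not the in-database implementation: the class name, the use of a per-partition mean as the "sub-model", and the region keys are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


class PartitionedMeanModel:
    """Toy ensemble: one mean-value model per partition key."""

    def __init__(self):
        self.partition_models = {}

    def fit(self, rows):
        """rows is an iterable of (partition_key, target) pairs."""
        grouped = defaultdict(list)
        for key, target in rows:
            grouped[key].append(target)
        # Build one sub-model per partition; here a sub-model is just a mean.
        self.partition_models = {k: mean(v) for k, v in grouped.items()}
        return self

    def predict(self, partition_key):
        # Callers score against the top-level model, which routes the
        # request to the sub-model for the matching partition.
        return self.partition_models[partition_key]


training = [("EMEA", 10.0), ("EMEA", 20.0), ("APAC", 5.0), ("APAC", 7.0)]
regional_model = PartitionedMeanModel().fit(training)
print(regional_model.predict("EMEA"), regional_model.predict("APAC"))
```

The caller only ever holds `regional_model`. The per-partition sub-models are an internal detail, which mirrors the single top-level model described above.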

Oracle Machine Learning supports AutoML through the OML4Py API and the no-code OML AutoML UI. Even data scientists who want to build models explicitly can use AutoML to get a jump on testing multiple algorithms and understanding the tuned hyperparameters. For non-experts, one pain point is the need to understand individual machine learning algorithms and their hyperparameters in enough detail to get high-quality models. AutoML eliminates the need to know about individual algorithm hyperparameter details.

The OML AutoML UI supports automated machine learning for both data scientist productivity and non-expert user access to powerful in-database algorithms. It accelerates machine learning projects by eliminating the repetitive nature of the machine learning process.

Reason #8: Cloud and on-premises

By virtue of being in Oracle Database, the same machine learning algorithms are available in databases both in the cloud and on premises. Users have the option to build models on premises and deploy to the cloud, and vice versa. For the cloud, this includes Oracle Database Cloud Service or any deployment of Oracle Database on cloud compute, as well as Oracle Autonomous Database for self-driving database operation.

On premises, you can take advantage of your own hardware or use the optimized Exadata platform, including through Cloud at Customer offerings.

Further, in-database models and those produced in ONNX format can be used in Oracle Autonomous Database via OML Services, whether those models were produced in the cloud or on premises.

Reason #9: Ease of deployment

Even when machine learning models produce useful results, getting a solution to production can pose challenges: model storage and management, limited language APIs, and deploying R- and Python-based solutions without having to spawn and manage R and Python engines. Oracle Machine Learning provides options to address each of these.

OML in-database machine learning models exist in the database schema immediately when built and can be used from SQL, Python, and R. The ability to score data from SQL means that any application or dashboard that accesses a database can also dynamically score data, whether in batch or individually. You can also deploy in-database models to OML Services for access through REST endpoints. As noted above, OML Services accommodates non-native models in ONNX format, e.g., as produced via TensorFlow or other libraries.

For solutions scripted in Python or R, users can deploy their user-defined R and Python functions from Oracle Database – with built-in data-parallel and task-parallel options and SQL and REST interfaces.

Part of solution deployment includes monitoring data and machine learning models for changes in characteristics or statistical properties. When data drift or model drift occurs, you’ll want to investigate causes and possibly rebuild machine learning models. OML Services provides REST endpoints for running monitoring jobs on a recurring schedule.

Reason #10: Included with Oracle Database and Oracle Autonomous Database

Applicable Oracle Machine Learning features are included with Oracle Autonomous Database, Oracle Database Cloud Service, and licenses of Oracle Database at no additional cost.

Further, Oracle Cloud Infrastructure with Oracle Autonomous Database supports predictable performance at an affordable and predictable cost and provides the best price/performance available in the market to date. Workloads deployed on Oracle Cloud Infrastructure often require fewer compute servers and block-storage volumes, lowering the cost of delivering optimized workload performance. With Autonomous Database, users gain the benefit of machine learning functionality that is automatically provisioned, configured, and managed.

For more information…

So, there you have it: ten reasons to try and adopt Oracle Machine Learning with Oracle Database and Oracle Autonomous Database.

With Oracle LiveLabs, you can quickly spin up an Autonomous Database instance and explore various use cases and scenarios using OML technology. Use the “green button” to get working in a few minutes.

Included with OML Notebooks on Autonomous Database are over 70 “template example” notebooks for our SQL, Python, and REST interfaces. You can see how to use various OML features and then create editable notebooks.

Check out our webpage, blog, and documentation. Additional examples are included at our machine learning GitHub repository. Through our OML Office Hours, view the rich library of recorded sessions that highlight OML product demonstrations, use cases, and application integrations. Sign up to receive notifications of upcoming sessions today.