Guest post by Carl Olofson, Research Vice President, Data Management Software, IDC

 

Machine learning (ML) has captured the imaginations of those involved in advanced analytics for over a decade, fueling a desire to create systems that can intelligently classify data and respond to those classifications by optimizing data used and deriving inferences in multiple areas including marketing, operations optimization, and fraud detection, among others. The process involves capturing a relevant sample of data in a particular area and loading it into a workspace where an artificial intelligence (AI) system can be trained to recognize patterns in the data that are pervasive yet subtle, and to score the significance of data patterns in a way that enables smart identification and decision making.

When such data comes from a managed database, it is offloaded from the database into a file associated with the ML system, which then scans the data. The system looks for correlations according to an AI algorithm and identifies data patterns in relation to those correlations. The resulting identifications are applied by the AI system against the live dataset to drive various actions and outcomes. However, because the training data must be shipped out of the database and stored elsewhere, and the resulting models shipped back, this process tends to be time-consuming and creates security gaps.
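To make that round trip concrete, here is a minimal sketch of the conventional workflow in Python. Everything in it is an illustrative assumption rather than any vendor's method: an in-memory SQLite database stands in for the managed database, the `payments` table and its columns are hypothetical, and the "model" is a toy threshold placed midway between the average legitimate and average fraudulent amounts.

```python
import csv
import io
import sqlite3

# Stand-in for a managed database (assumption: in-memory SQLite, not MySQL).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE payments ("
    "id INTEGER PRIMARY KEY, amount REAL, is_fraud INTEGER, predicted_fraud INTEGER)"
)
db.executemany(
    "INSERT INTO payments (amount, is_fraud) VALUES (?, ?)",
    [(12.0, 0), (30.0, 0), (25.0, 0), (900.0, 1), (1200.0, 1),
     (40.0, None), (1000.0, None)],  # the last two rows are live, unlabeled data
)

# Step 1: offload the labeled rows from the database into a file.
export = io.StringIO()  # in-memory stand-in for the exported file
csv.writer(export).writerows(
    db.execute("SELECT amount, is_fraud FROM payments WHERE is_fraud IS NOT NULL")
)

# Step 2: train outside the database on the exported copy.
export.seek(0)
rows = [(float(amount), int(label)) for amount, label in csv.reader(export)]


def mean(values):
    return sum(values) / len(values)


legit_mean = mean([amount for amount, label in rows if label == 0])
fraud_mean = mean([amount for amount, label in rows if label == 1])
threshold = (legit_mean + fraud_mean) / 2  # the entire toy "model"

# Step 3: ship the model back and apply it to the live, unlabeled rows.
db.execute(
    "UPDATE payments SET predicted_fraud = (amount > ?) WHERE is_fraud IS NULL",
    (threshold,),
)
scored = db.execute(
    "SELECT amount, predicted_fraud FROM payments "
    "WHERE is_fraud IS NULL ORDER BY amount"
).fetchall()
print(scored)
```

Even in this toy version, the data leaves the database in step 1 and the model has to be carried back in step 3. Those are the two hops where the delay and the security exposure described above arise.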

Now there is an easier way. On March 29th, Oracle announced the latest version of MySQL HeatWave, the database cloud service based on the company’s open-source MySQL. Oracle MySQL HeatWave ML includes built-in ML functionality within the database, delivered in a manner that is easy to use. The new version promises to be a game changer for application developers and a broad range of data analysts and scientists.

Unlike the common case where data is moved elsewhere in order to train the ML system, in MySQL HeatWave ML training takes place right in the database system. In addition to providing in-database training, HeatWave ML provides full automation of the process, eliminating manual steps with their attendant risk of human error or bias. HeatWave ML leverages the scalability of HeatWave clusters to deliver highly robust performance. It automates many aspects of the various stages of preparing and sampling the data, and it offers support for algorithm selection and hyperparameter tuning, saving time and effort. At the end of the process, HeatWave ML provides explanations for all its models, which help organizations with regulatory compliance (for example, explaining why a mortgage or credit card application was approved or denied) and build trust in machine learning. Also, whereas most systems only use analytic data to train their models, MySQL HeatWave ML supports the use of live transactional data as well. MySQL HeatWave ML runs natively on Oracle Cloud Infrastructure (OCI) and is available at no charge to MySQL HeatWave users.

Other enhancements to MySQL HeatWave in this announcement include much more capable dynamic cluster management, deeper data compression, and a “pause-and-resume” feature, which allows users to essentially “press pause” when they will not be using the database for an extended period of time. There is no charge for database services while the database is paused. Although HeatWave features an analytic transaction processing capability, mixing transactions with analytics in single operations, its performance in dealing with pure analytic workloads is also impressive. Oracle recently released TPC-DS benchmark results that back up that assertion.

Oracle MySQL HeatWave ML represents a significant step forward. AI functionality is one of the hottest areas of data analytics and of data-driven operations. However, the slow and balky nature of normal ML practices and procedures has inhibited the utility and broad use of AI/ML in many cases. Providing this capability in the database, where it can run on live transactional data and offers programmatic support that gives more than just data scientists a shorter and easier path to valuable results, is a material accomplishment. This also means greater scalability, so adding more resources improves the performance of the application. In the previous release, Oracle published MySQL HeatWave scalability numbers for SQL analytics. This release adds real-time elasticity with the promise of zero downtime while demonstrating scalability for machine learning workloads. The scale-out architecture of MySQL HeatWave, with its focus on scalability, represents a key differentiator in this market.

Since MySQL is the most popular open-source relational database system, the possibilities for HeatWave ML in terms of adding intelligence to applications of many sizes and shapes are enormous. One key benefit is that the MySQL user gets this capability without changing a single line of existing application code or moving, converting, or transforming existing data.

IDC has predicted that AI/ML spending will grow by nearly 20% in 2022, according to our Worldwide Semiannual Artificial Intelligence Tracker. Much of that spend will be on high-end technologies understandable and available to only an elite few. With HeatWave ML, Oracle is providing a capability that is much more broadly consumable, on a platform that is familiar to millions. While Oracle may have a lead in certain regards, IDC expects competition in the area of database-embedded AI/ML to be hot for some time to come. For MySQL users, however, this release represents a substantial leap in performance and functionality that enables more sophisticated data processing without technical complexity.