MySQL HeatWave Lakehouse brings transactional and semistructured data into one high-speed query engine

October 19, 2022 | 4 minute read
Jeffrey Erickson
Director of tech content

Oracle MySQL HeatWave is the gift that keeps on giving. Launched in November 2020, MySQL HeatWave brings transaction processing, analytics, and machine learning inside the MySQL database for near real-time analytics. Now Oracle is announcing MySQL HeatWave Lakehouse, which lets you query transactional data in the MySQL database and terabytes of data in object store—all in a single query.

MySQL HeatWave Lakehouse is built for speed and efficiency. It can process hundreds of terabytes of data in object store in a variety of file formats, including CSV and Parquet, as well as backups from Amazon Aurora and Amazon Redshift. Because analytics and machine learning run inside the database, MySQL HeatWave Lakehouse is easy to use and has a large performance advantage for both loading data and running queries.

In an industry first, “MySQL HeatWave provides one integrated service for transaction processing, analytics across data warehouses and data lakes, and machine learning without ETL,” said Oracle’s chief corporate architect, Edward Screven, onstage October 18 at Oracle CloudWorld in Las Vegas. MySQL HeatWave can directly query data in files and objects and process it using analytic queries. That means for the first time, MySQL users “don't need to go through an ETL phase in order to use the data that you have,” Screven said. “I mean, this is astounding, right?”

Data lakehouse technology combines the transactional data of a conventional data warehouse with data lake technology for storing and analyzing semistructured data. By combining the two data sources, business leaders get much deeper insights into their business performance. The insights from a data lakehouse are often more valuable the closer the data is to real time. Yet technologists who use MySQL have, until now, been forced to move live data out of their MySQL database and into a separate database for analysis. The extra steps add cost, complexity, and time to their processes and lock them out of real-time analysis. In contrast, with MySQL HeatWave, transactions are moved into the analytics engine as they occur, so the analyzed data is always up to date.

“Massive improvements”

MySQL HeatWave Lakehouse, which Oracle expects to make generally available next year, “delivers massive improvements in performance, automation, and cost” compared with traditional methods used by the competition, Screven said.

The performance results are dramatic. MySQL HeatWave Lakehouse delivers query performance that’s 17X faster than Snowflake and 6X faster than Redshift on 400 TB workloads. Loading data into MySQL HeatWave Lakehouse is also much faster—8X faster than Redshift and 2.7X faster than Snowflake. Benchmark scripts are available on GitHub for anyone to replicate.

“Take a step back and think about that,” said Screven. “Snowflake’s entire business is based on loading data and querying data from files. We added Lakehouse support to MySQL HeatWave, and we beat them in every way that matters. We query 17 times faster and we load 2.7 times faster.” Screven said the hardware configurations Oracle used to run the test also were less expensive than those used for the other vendors.
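To put those multipliers in concrete terms, here is a minimal Python sketch of the arithmetic. It assumes "N times faster" means the same job finishes in 1/N of the baseline time. The baseline durations are hypothetical round numbers chosen for illustration only; they are not figures from Oracle's benchmark.

```python
def elapsed_with_speedup(baseline_minutes, speedup):
    """Elapsed time for a system that is `speedup` times faster than the baseline.

    Assumes "N times faster" means the job takes 1/N of the baseline time.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_minutes / speedup


# Multipliers quoted in the article (HeatWave Lakehouse vs. Snowflake).
QUERY_SPEEDUP = 17
LOAD_SPEEDUP = 2.7

# Hypothetical baseline durations, NOT taken from the benchmark.
baseline_query_minutes = 170.0
baseline_load_minutes = 270.0

if __name__ == "__main__":
    query = elapsed_with_speedup(baseline_query_minutes, QUERY_SPEEDUP)
    load = elapsed_with_speedup(baseline_load_minutes, LOAD_SPEEDUP)
    print(f"query: {baseline_query_minutes:.0f} min -> {query:.1f} min")
    print(f"load:  {baseline_load_minutes:.0f} min -> {load:.1f} min")
```

Actual elapsed times depend on the workload and hardware, which is why the published benchmark scripts are the better reference for real comparisons.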

Screven was joined on the CloudWorld stage by Oracle’s head of MySQL HeatWave development, Nipun Agarwal, who showed MySQL HeatWave Lakehouse’s ease of use in demos that included the availability of MySQL HeatWave on cloud infrastructures such as AWS and Azure and a machine learning automation engine called Oracle MySQL Autopilot.

MySQL Autopilot provides workload-aware, machine learning–based automation to MySQL HeatWave. New MySQL Autopilot capabilities for MySQL HeatWave Lakehouse on Oracle Cloud Infrastructure (OCI) will improve performance and reduce administration with features such as auto schema inference, adaptive data sampling, auto load, and adaptive data flow. “It uses machine learning and advanced optimization techniques to figure out how to run queries better, how to load data better, how to place data better, or how to compress data better,” said Screven.  
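As a rough illustration of what schema inference over sampled data can look like, here is a minimal Python sketch. It is not Autopilot's algorithm, which the article does not describe. The function names, the type names, the fixed sample size, and the single date format are all assumptions made for this example.

```python
import csv
import io
from datetime import datetime


def _parse_date(text):
    # Assumption: dates use ISO format (YYYY-MM-DD).
    return datetime.strptime(text, "%Y-%m-%d")


def infer_type(values):
    """Return the first candidate type that parses every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "VARCHAR"
    # Candidates are tried from narrowest to widest.
    for name, parse in (("BIGINT", int), ("DOUBLE", float), ("DATE", _parse_date)):
        try:
            for v in non_empty:
                parse(v)
            return name
        except ValueError:
            continue
    return "VARCHAR"


def infer_schema(csv_text, sample_rows=100):
    """Infer a column-name -> type mapping from the first `sample_rows` rows."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Read only a bounded sample instead of the whole file.
    sample = [row for _, row in zip(range(sample_rows), reader)]
    columns = list(zip(*sample)) if sample else [() for _ in header]
    return {name: infer_type(col) for name, col in zip(header, columns)}


if __name__ == "__main__":
    data = (
        "order_id,amount,order_date,note\n"
        "1,9.99,2022-10-18,first\n"
        "2,15,2022-10-19,\n"
    )
    print(infer_schema(data))
```

Sampling keeps the cost of inference bounded regardless of file size, at the risk of missing values later in the file that would not fit the inferred type. A production system would need to handle that case.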

Multicloud and on-premises availability

Oracle is making MySQL HeatWave available where organizations are and how they like to work. Database teams can replicate data from their on-premises MySQL OLTP applications to MySQL HeatWave running in multiple clouds, including OCI, AWS, and Microsoft Azure. For organizations that prefer to avoid moving their database workloads to the public cloud, it’s available in an organization’s own data center as part of OCI Dedicated Region. MySQL HeatWave is updated to stay on the latest version of the MySQL database. MySQL HeatWave Lakehouse is currently in beta and is slated for general availability in 2023.

Learn more

Read the MySQL HeatWave Lakehouse press release

See the MySQL HeatWave technical brief (PDF)

Get hands-on training with MySQL HeatWave workshops

 

 

Jeff Erickson is director of tech content at Oracle. You can follow him on Twitter @erickson4.

