We are proud to announce a validated reference architecture for Cloudera Enterprise Data Hub on Oracle Cloud Infrastructure. Starting today, you can deploy Cloudera's industry-leading big data technology on Oracle's high-performance cloud with full Cloudera support.
The Cloudera and Oracle partnership allows customers to deploy comprehensive data strategies, from business operations to data warehousing, data science, data engineering, streaming, and real-time analytics, all on a unified enterprise cloud platform with unmatched performance, security, and availability.
Cloudera Enterprise Data Hub brings together the best big data technologies from the Apache Hadoop ecosystem, including HDFS, HBase, Hive, Spark, Impala, Solr, and Kudu, and adds consistent security, granular governance, and full support. It is the fastest, most secure, and easiest-to-use big data software available.
Cloudera is a great choice for a variety of big data use cases, including:
Cloudera on Oracle Cloud Infrastructure is a joint solution that combines the flexibility and performance of Oracle Cloud Infrastructure with the scalable data management of Cloudera Enterprise Data Hub. Our solution enables customers to realize their data strategies, from operational to analytics, with amazing performance, an unmatched data ecosystem, and the inherent benefits of moving from on-premises fixed infrastructure to elastic cloud infrastructure.
Blazing Fast Big Data Performance
Oracle offers the most powerful bare metal compute instances with local flash storage in the industry. In a TeraSort benchmark test, 10 worker nodes on Oracle Cloud sorted 10 terabytes of data in about 45 minutes. Although this scale is only a fraction of what is possible, the graph of the benchmark shows the impact of bare metal versus VM and of local storage versus block storage.
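As a rough back-of-the-envelope reading of the benchmark figure above, the sketch below converts "10 terabytes in about 45 minutes on 10 worker nodes" into an aggregate and a per-node sort rate. It assumes decimal terabytes (10^12 bytes) and an even split of work across nodes, neither of which is stated in the benchmark description; the function name is illustrative only.

```python
def sort_throughput(total_bytes, minutes, nodes):
    """Return (aggregate, per_node) throughput in decimal GB/s."""
    seconds = minutes * 60
    aggregate_gb_per_s = total_bytes / seconds / 1e9
    # Assumption: work is spread evenly across all worker nodes.
    return aggregate_gb_per_s, aggregate_gb_per_s / nodes


# Figures from the text: 10 TB, about 45 minutes, 10 worker nodes.
aggregate, per_node = sort_throughput(10e12, 45, 10)
print(f"aggregate: {aggregate:.2f} GB/s, per node: {per_node:.2f} GB/s")
```

Under these assumptions the cluster sorts at roughly 3.7 GB/s overall, or about 0.37 GB/s per worker node. This is an end-to-end sort rate, not a raw disk or network figure.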
Only Oracle offers this big-data-ready local storage, based on advanced NVMe SSD technology and backed by a storage performance SLA.
The bare metal compute instances are connected in clusters to a non-oversubscribed 25-gigabit network infrastructure, guaranteeing extremely low latency and very high throughput, both key requirements for high-performance big data workloads. In fact, Oracle Cloud Infrastructure is the only cloud with a network throughput performance SLA.
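To put the 25-gigabit figure in data-movement terms, the following sketch converts the link speed to bytes per second and estimates the best-case time to move one decimal terabyte between two nodes. It treats 25 Gbit/s as a theoretical line rate and ignores protocol overhead and disk limits, so real transfers would take longer; the helper names are hypothetical.

```python
LINK_GBIT_PER_S = 25  # network speed quoted in the text


def line_rate_bytes_per_s(gbit_per_s):
    """Convert a link speed in Gbit/s to bytes per second (8 bits per byte)."""
    return gbit_per_s * 1e9 / 8


def transfer_seconds(payload_bytes, gbit_per_s):
    """Best-case transfer time at full line rate, ignoring all overhead."""
    return payload_bytes / line_rate_bytes_per_s(gbit_per_s)


rate = line_rate_bytes_per_s(LINK_GBIT_PER_S)
seconds = transfer_seconds(1e12, LINK_GBIT_PER_S)  # 1 TB payload
print(f"line rate: {rate / 1e9:.3f} GB/s, 1 TB in about {seconds:.0f} s")
```

At line rate a single link carries 3.125 GB/s, so shuffling 1 TB between two nodes takes at least 320 seconds.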
Unmatched Data Ecosystem
Cloudera clusters that are spun up in the cloud can sit right next to Exadata or Oracle Database environments over private networks, allowing easy data sharing for analytics purposes. Gartner regards Oracle as one of the top three vendors in the Data Management Storage Analytics space, making Cloudera on Oracle Cloud Infrastructure a great choice for running analytics workloads.
Right-Size Your Infrastructure in the Cloud
Cloud infrastructure enables you to deploy the optimal amount of infrastructure to meet your demands. No more under-utilization of too much infrastructure, or long queues due to under-forecasting. In addition, Oracle offers:
You can easily deploy Cloudera Enterprise Data Hub on Oracle Cloud Infrastructure by using Terraform automation.
There are multiple Terraform templates for deploying a fully configured Cloudera Enterprise Data Hub instance or cluster on Oracle Cloud Infrastructure. Currently, you can choose from the Sandbox, Development, Production Starter, and N-Node templates (N-Node is configurable for clusters of any scale). For details about the Terraform templates, see the Readme.md file.
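For readers who script their deployments, here is a minimal sketch of the usual Terraform init, plan, and apply sequence driven from Python. It is not the procedure from the templates' Readme.md: the template directory path is a placeholder, and the sketch assumes Terraform is installed and that Oracle Cloud Infrastructure credentials and any required template variables are already configured.

```python
import subprocess
from pathlib import Path


def terraform_steps(plan_file="tfplan"):
    """Standard Terraform CLI sequence: initialize, save a plan, apply that plan."""
    return [
        ["terraform", "init"],
        ["terraform", "plan", f"-out={plan_file}"],
        ["terraform", "apply", plan_file],
    ]


def deploy(template_dir):
    """Run each step inside the chosen template directory, stopping on the first failure."""
    for command in terraform_steps():
        subprocess.run(command, cwd=Path(template_dir), check=True)


# Hypothetical usage; replace the path with wherever the chosen template lives:
# deploy("path/to/N-Node")
```

Saving the plan to a file and applying that exact file means the changes that were reviewed are the changes that run.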
For more information about installing and using Terraform on Oracle Cloud Infrastructure, see Terraform on Oracle Cloud Infrastructure.
A white paper that details a reference architecture for Cloudera Enterprise Data Hub on Oracle Cloud Infrastructure and the use of these Terraform templates is available: Cloudera Enterprise Data Hub Reference Architecture for Oracle Cloud Infrastructure Deployments.
We hope you will be as excited as we are about the Cloudera plus Oracle solution. Let us know what you think!
Director, Product Management