The advent of data lakes was a major shift for businesses. Low-cost storage made it possible to keep and use data that would otherwise be discarded. Decoupling compute from storage meant far more efficient use of resources. And open-source tools and frameworks like Spark and Python made it easier to load and transform data and apply machine learning. The cost of experimenting with new data dropped sharply, and that opened up new possibilities.
But one thing didn’t change. The vast majority of analytics still uses SQL, whether someone is writing SQL queries directly or using tools like Power BI or Oracle Analytics Cloud that generate SQL behind the scenes. You can use SQL with traditional data lakes, but the focus there has been on lowering cost, not optimizing SQL performance. What if you could get all the benefits of a data lake, particularly low-cost storage and the separation of compute from storage, but with SQL performance up to 20x faster?
Now you can. The cost of storage for Autonomous Data Warehouse has been reduced by over 75%. Storing data in Oracle Autonomous Data Warehouse’s highly optimized Exadata Database storage now costs the same as placing it in object storage. The big difference is a significant performance boost. Exadata Database storage is designed for database workloads: SQL processing and analytics take place in the storage tier, optimized networking reduces I/O latency, sophisticated caching improves throughput, and more. As a result, SQL workloads run up to 20x faster.
This lets you rethink where data should be stored. Before this price change, object storage was used for handling raw data, transforming data, staging valuable data prior to loading into a data warehouse, and storing older or less critical data. Plenty of SQL workloads ran against data in the lake, but lower performance was acceptable given the lower costs.
But now that object storage and database storage cost the same, you can simply place data where you get the highest performance. So while it still makes sense to land raw data in a data lake and transform it using Spark-based tools, the output can now go straight to high-performance Autonomous Data Warehouse storage, as sketched below. This low-cost storage is combined with elastic, auto-scaling compute resources, so you pay for only the compute you need. Costs go down, and analytics performance goes up. And with new, open data sharing, that data is also easily and securely accessible to other users and applications (including Spark), both inside and outside the organization.
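To make that concrete, here is a minimal PySpark sketch of such a pipeline: raw data is read from object storage, transformed, and written directly to an Autonomous Data Warehouse table over JDBC. The bucket path, wallet location, table name, and credentials are placeholders, and the sketch assumes the Oracle JDBC driver jar is available to the cluster; adapt the details to your environment.

```python
from pyspark.sql import SparkSession

# Build a session; assumes the Oracle JDBC driver jar is available.
spark = (
    SparkSession.builder
    .appName("lake-to-adw")
    .config("spark.jars", "/path/to/ojdbc8.jar")  # placeholder path
    .getOrCreate()
)

# Land and transform raw data in the lake as before.
raw = spark.read.parquet("oci://landing-bucket/raw/orders/")  # placeholder bucket
clean = raw.dropDuplicates(["order_id"]).filter("order_total > 0")

# Write the result straight to ADW instead of staging it back in object storage.
(
    clean.write.format("jdbc")
    # Placeholder connection string: ADW service name plus wallet directory.
    .option("url", "jdbc:oracle:thin:@mydb_high?TNS_ADMIN=/path/to/wallet")
    .option("dbtable", "ORDERS_CLEAN")    # hypothetical target table
    .option("user", "ANALYTICS")          # hypothetical schema
    .option("password", "<fetch-from-secrets-store>")
    .option("driver", "oracle.jdbc.OracleDriver")
    .mode("append")
    .save()
)
```

On the consumption side, assuming the open data sharing mentioned above is exposed via the Delta Sharing protocol, a recipient could read a shared table with the open-source delta-sharing client. The profile file path and the share/schema/table coordinates here are hypothetical.

```python
import delta_sharing  # open-source client: pip install delta-sharing

# Profile file downloaded from the share provider; path is a placeholder.
profile = "/path/to/share_profile.json"

# Table coordinates follow the <profile>#<share>.<schema>.<table> convention.
table_url = profile + "#sales_share.analytics.orders_clean"  # hypothetical

# Load the shared table into a pandas DataFrame.
df = delta_sharing.load_as_pandas(table_url)
print(df.head())
```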
The new storage pricing is just one of several recent innovations you may want to check out.