WANdisco LiveData Migrator for Oracle Cloud

January 31, 2022 | 6 minute read
Paul Scott-Murphy
CTO at WANdisco

A data lakehouse combines the cost and scale advantages of the data lake with the analytic capabilities of data warehouse technologies. It enables you to ingest, store, analyze, and derive business value from large and changing datasets. Oracle Cloud Infrastructure (OCI) provides the foundation for the lakehouse as an open and flexible approach to analyzing any type of data, at any scale, from anywhere. However, migrating your data lake into OCI to use this architecture can be challenging without the right support.

Oracle has partnered with WANdisco to make the unique benefits of WANdisco LiveData Migrator available to OCI users as a part of Oracle Cloud Lift services. This migration solution is ready for organizations that want to move away from costly, high-maintenance traditional platforms, like Hadoop and Spark on-premises or in other clouds, without disrupting the operation of those platforms during migration.

The unique LiveData capabilities from WANdisco mean that data previously locked on-premises or in first-generation cloud environments becomes immediately available and usable across all of OCI. Organizations can gain the business advantages of modernization in OCI without risk or delay.

A data lakehouse on OCI with WANdisco enables data modernization

Data is increasingly varied, spanning structured and unstructured sources, growing exponentially, and becoming available from a broader range of sources. Keeping ahead of this growth is challenging without the right infrastructure, processes, and technologies to consume, manage, correlate, and analyze varied data at scale. These challenges have led to the emergence of the lakehouse as the best approach to meeting business requirements from data at scale.

The business advantages of the lakehouse architecture come from making data more accessible and capable at scale. It supports better, faster, more accurate, and more comprehensive business processes by combining the scale and cost advantages of the data lake with the speed and accessibility of the data warehouse. The technical advantages of the lakehouse support what users need to drive analytics, machine learning, and AI outcomes through immediate access to more and broader data, more efficiently and at lower cost than traditional data architectures.

Oracle’s unique capabilities with data lakehouse architectures start with the core functionality for object and relational data storage. OCI services incorporate the following capabilities for the data lakehouse:

  • Data movement

  • Managed open source

  • Data warehousing

  • Data definition and discovery

  • AI services for automation, preparation, and prediction

The outcome is a consistent and comprehensive lakehouse architecture that builds on the next-generation cloud technology in OCI. This architecture uses important innovations that make OCI a natural and compelling platform for organizations to modernize every aspect of their business. Technologies such as serverless Spark with Data Flow, Autonomous Data Warehouse, OCI Object Storage, and Oracle Big Data Service combine to provide significant technical advantages that underpin a lakehouse.

Lakehouses are built from the foundations of existing data in data lakes, warehouses, on-premises applications, and systems. OCI provides the infrastructure and services necessary to establish a lakehouse but doesn’t provide the proprietary data, metadata, processes, and other workloads that are unique to each organization. You bring your own data, analytics requirements, data generation, and ingest mechanisms to complement the critical lakehouse-capable functionality of OCI.

Just as organizations benefit from bringing their own data to OCI, they lose out if they abandon existing investments in on-premises datasets, applications, workflows, and metadata. WANdisco’s migration solution helps organizations simplify, automate, and reduce the risk of making existing data investments available to an Oracle lakehouse without rearchitecture or business downtime.

WANdisco LiveData Migrator

LiveData Migrator automates the large-scale movement of data and metadata from existing on-premises data lakes, Spark, and Hadoop environments to OCI. Because data and metadata migration is as simple as selecting the needed datasets, organizations can continue to operate all their existing data infrastructure without any disruption. At the same time, they can introduce WANdisco LiveData Migrator, through which they select the data and metadata that need to be available in OCI as the foundation on which the lakehouse is established.

WANdisco’s technology automates the migration of changing datasets at scale and is designed and proven to scale to multipetabyte data lake migrations with ease. OCI adopters now have the following key innovations available through WANdisco’s technology:

  • Live data migration allows you to migrate actively changing data without scheduling repeated scans of the source datasets. Migration begins and completes without any change to the source and without needing to restrict data movement to periods of downtime or reduced usage at the source. You can eliminate costly planning and modifications to source systems just for migration to OCI.

  • Selective data migration ensures that you don’t incur the overhead of moving unneeded data. You can bypass temporary files, staging locations, intermediate representations, and more to ensure that the only data delivered to your lakehouse is what you want available.

  • Selective metadata migration makes the information describing your data available to OCI Data Catalog, so you can immediately access and process structured or semi-structured data with ease.

  • Data consistency provides comprehensive facilities to ensure that data arrives in your lakehouse in the same form present in your sources. WANdisco LiveData Migrator provides confidence that your analytic outcomes from OCI are wholly accurate and comprehensive.

  • Broad compatibility allows you to migrate a wide range of different source data lakes, whether they’re on-premises, in other clouds, or built on platforms like Hadoop or Spark.
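The idea behind selective data migration can be illustrated with a minimal Python sketch. It is not LiveData Migrator's configuration syntax or API: the exclusion patterns, the sample paths, and the function names `is_selected` and `select_for_migration` are all hypothetical, chosen only to show how temporary files and staging locations might be filtered out before data is moved.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion patterns for illustration only.
# These are NOT LiveData Migrator settings.
EXCLUDE_PATTERNS = [
    "*/_temporary/*",  # in-flight job output directories
    "*/.staging/*",    # staging locations
    "*.tmp",           # temporary files
]


def is_selected(path, exclusions=EXCLUDE_PATTERNS):
    """Return True when no exclusion pattern matches the path."""
    return not any(fnmatchcase(path, pattern) for pattern in exclusions)


def select_for_migration(paths, exclusions=EXCLUDE_PATTERNS):
    """Keep only the paths that should be delivered to the target."""
    return [path for path in paths if is_selected(path, exclusions)]


# Hypothetical source listing.
source_paths = [
    "/data/sales/2021/part-0000.parquet",
    "/data/sales/_temporary/0/part-0001.parquet",
    "/data/etl/.staging/job_42/input.csv",
    "/data/logs/app.log.tmp",
    "/data/logs/app.log",
]

selected = select_for_migration(source_paths)
for path in selected:
    print(path)
```

In this sketch only the two paths that match no exclusion pattern remain selected, which mirrors the goal of delivering only the data you want available in the lakehouse.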

By migrating existing and new data without disruption to source environments that you operate on-premises or in other clouds, WANdisco greatly reduces the cost and effort required to apply all the functionality and benefits of an Oracle lakehouse. You can use the shared data, analytics, governance, security, and scalability of OCI, even while the migration of existing data and workloads is underway.

LiveData Migrator operation

A screenshot of the WANdisco dashboard.

Migrating your data and metadata to Oracle Cloud takes the following steps:

  1. Install LiveData Migrator on an edge node of your source Hadoop environment without any service restarts or application changes.

  2. Specify your target OCI Object Storage and OCI Data Catalog.

  3. Start your data and metadata migrations by selecting the content that you want to migrate.

All the other capabilities of LiveData Migrator that ease management and operation for migrations at any scale are available in full. You can monitor and validate migration outcomes as you need, all while continuing to operate the source platform without any change to application behavior, or disruption to service.
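As a rough illustration of what validating a migration outcome can involve, the following Python sketch compares content digests for a source and a target listing. It is an independent spot check under stated assumptions, not a description of how LiveData Migrator verifies data: the in-memory dictionaries stand in for real storage, and the names `build_manifest` and `compare_manifests` are hypothetical.

```python
import hashlib


def build_manifest(files):
    """Map each path to the SHA-256 hex digest of its content."""
    return {
        path: hashlib.sha256(content).hexdigest()
        for path, content in files.items()
    }


def compare_manifests(source, target):
    """Report paths that are missing, unexpected, or different on the target."""
    return {
        "missing": sorted(set(source) - set(target)),
        "unexpected": sorted(set(target) - set(source)),
        "mismatched": sorted(
            path
            for path in set(source) & set(target)
            if source[path] != target[path]
        ),
    }


# Hypothetical in-memory stand-ins for source and target storage.
source_files = {
    "sales/part-0000": b"alpha",
    "sales/part-0001": b"beta",
    "sales/part-0002": b"gamma",
}
target_files = {
    "sales/part-0000": b"alpha",
    "sales/part-0001": b"BETA",  # content differs from the source
}

report = compare_manifests(
    build_manifest(source_files),
    build_manifest(target_files),
)
print(report)
```

A report with three empty lists would indicate that every source path exists on the target with identical content. For data that is still changing at the source, a check like this is only meaningful against a consistent point in time.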

Next steps

Migration to OCI isn’t just about moving data and workloads. A successful migration must also incorporate an understanding of how the target lakehouse architecture can be used most effectively to modernize how an organization benefits from its data. You can refer to excellent examples of how organizations like Experian and Ingersoll Rand expanded their business goals and delivered on them through adoption of OCI and a lakehouse architecture in the Oracle Live recording, The Future of the Data Lakehouse.

WANdisco customers are automating cloud migrations for exabytes of data using LiveData Migrator, which simplifies data migration at scale and eliminates the traditional risks associated with it. However quickly you want to bring your Hadoop, Spark, or object storage data and metadata to OCI, having WANdisco’s technology available to support that effort can shorten the time it takes to benefit from a lakehouse architecture in Oracle Cloud Infrastructure.

For more information, visit LiveData Migrator for Oracle.

Guest Author

Paul Scott-Murphy, Chief Technology Officer at WANdisco, is responsible for the company’s product and technology strategy, including industry engagement, technical innovation, and the initiation and creation of new markets and products. This includes direct interaction with the majority of WANdisco’s significant customers, partners, and prospects. Previously VP of product management for WANdisco and Regional Chief Technology Officer for TIBCO Software in Asia Pacific and Japan, Paul has a Bachelor of Science with first class honors and a Bachelor of Engineering with first class honors from the University of Western Australia.
