Just a few days ago, we wrote about the newest release of Data Integration Platform Cloud (DIPC), 18.3.3. This is all very exciting! Here is a bit more detail on the two newest Elevated Tasks and the inclusion of Stream Analytics in DIPC.
This release of DIPC helps with data lake automation through a new Data Lake Builder task, which makes it intuitive to instantiate and copy data into a data lake and helps reduce some of the existing friction between data engineers and data scientists. You can quickly create a comprehensive, end-to-end, repeatable data pipeline to your data lake. And note that nothing is moved to the data lake without being fully governed!
When you add data to the data lake, DIPC follows a repeatable pattern to harvest, profile, ingest, shape, copy, and catalog this data. Data can be ingested from a variety of sources, including relational sources and flat files. Harvested metadata is stored in the DIPC Catalog, and the data is transformed and secured within the target data lake for downstream activities. For more information, see Adding Data to Data Lake.
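To make the harvest, profile, ingest, shape, copy, and catalog pattern a little more concrete, here is a minimal Python sketch of those six stages run in order over in-memory rows. It is only an illustration of the pattern, not DIPC's implementation or API: every name in it (`LakeRun`, `add_to_data_lake`, the `lake` and `catalog_store` dictionaries) is a hypothetical stand-in, and what each stage does here is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LakeRun:
    """Hypothetical record of one source passing through the pipeline."""
    source_name: str
    rows: list
    metadata: dict = field(default_factory=dict)
    completed: list = field(default_factory=list)


def harvest(run):
    # Collect the column names seen in the source as basic metadata.
    run.metadata["columns"] = sorted({key for row in run.rows for key in row})


def profile(run):
    # Record simple statistics: row count and missing values per column.
    run.metadata["row_count"] = len(run.rows)
    run.metadata["nulls"] = {
        col: sum(1 for row in run.rows if row.get(col) is None)
        for col in run.metadata["columns"]
    }


def ingest(run):
    # Work on a private copy so later stages never mutate the source rows.
    run.rows = [dict(row) for row in run.rows]


def shape(run):
    # Normalize the rows so each one carries every harvested column.
    for row in run.rows:
        for col in run.metadata["columns"]:
            row.setdefault(col, None)


def add_to_data_lake(source_name, rows, lake, catalog_store):
    """Run the six stages in a fixed order (assumed stage behavior)."""
    run = LakeRun(source_name, rows)
    steps = [
        ("harvest", lambda: harvest(run)),
        ("profile", lambda: profile(run)),
        ("ingest", lambda: ingest(run)),
        ("shape", lambda: shape(run)),
        # Copy the shaped rows into the (hypothetical) lake store.
        ("copy", lambda: lake.__setitem__(source_name, run.rows)),
        # Publish the harvested metadata to the (hypothetical) catalog.
        ("catalog", lambda: catalog_store.__setitem__(source_name, run.metadata)),
    ]
    for name, step in steps:
        step()
        run.completed.append(name)
    return run
```

Calling `add_to_data_lake("orders", rows, lake, catalog_store)` with two plain dictionaries for `lake` and `catalog_store` leaves the shaped rows under `lake["orders"]` and the harvested metadata under `catalog_store["orders"]`, which mirrors the idea that data only lands in the lake together with its catalog entry.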
The Replicate Data Task helps address high availability. Replicate into Oracle… or Kafka! And bring that together with Stream Analytics, which makes event processing possible on real-time data streams, including Spatial, Machine Learning, and queries on the data stream or cubes. With Stream Analytics, you can analyze the complex event data streams that DIPC consumes, using sophisticated correlation patterns, enrichment, and machine learning to provide insights and support real-time business decisions.
Very simply, the Replicate Data Task delivers changes from your source data to the target. You set up connections to your source and target, and from the moment that you run this task, any new transaction in the source data is captured and delivered to the target. This task doesn't perform an initial copy of the source (for the initial load, see Setting up a Synchronize Data Task), so you'll get only the changes made from the point in time that you started your job. This task is especially well suited for streaming data to Kafka targets. For more information, see Setting up a Replicate Data Task.
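The "changes only, no initial copy" behavior is easy to picture with a small Python sketch. This is a toy model under stated assumptions, not how DIPC captures transactions: `SourceLog` and `ReplicateTask` are hypothetical classes, the source is modeled as an append-only list of changes, and the target is a plain list standing in for an Oracle or Kafka target.

```python
class SourceLog:
    """Hypothetical append-only log of committed source transactions."""

    def __init__(self):
        self._entries = []

    def commit(self, change):
        self._entries.append(change)

    def position(self):
        # The next position to be written, i.e. the current end of the log.
        return len(self._entries)

    def read_from(self, position):
        return self._entries[position:]


class ReplicateTask:
    """Delivers only the changes committed after the task was started."""

    def __init__(self, source, target):
        self.source = source
        self.target = target
        self._position = None

    def start(self):
        # Remember where the log stood at start time. Anything committed
        # earlier is never copied, so there is no initial load.
        self._position = self.source.position()

    def poll(self):
        if self._position is None:
            raise RuntimeError("task has not been started")
        changes = self.source.read_from(self._position)
        self.target.extend(changes)
        self._position += len(changes)
        # Report how many changes were delivered in this poll.
        return len(changes)
```

If one row is committed before `start()` and two changes after it, a call to `poll()` delivers just the two later changes. The earlier row would have to arrive through a separate initial load, which is the role the post assigns to the Synchronize Data Task.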