In the current release of Data Integration Services, a new feature has been introduced: templates. These prebuilt sets of tasks can be leveraged to simplify day-to-day data operations within the service or across OCI and multi-clouds. The tasks, data flow, and pipeline design artifacts can also be saved and modified for a new custom application from the template. This blog covers the steps to create an application from the templates, configure and execute tasks, use them in a pipeline or schedule, and eventually modify them to include your customizations.

A typical workflow from the applications creation menu is to select one of the available templates and create an application from that template. A preview will allow a closer inspection of the template components. Figure 1 below depicts one complex template with multiple activities to be executed to make it active.

Oracle DB to ADW Incremental Load Template preview

Figure – 1 Oracle DB to ADW Incremental Load Template preview

REST Task Templates

Some templates provide tasks that can be executed off the shelf. The current release of templates explores a combination of REST tasks varying from operators managing objects in the Oracle Object Store to service calls to other OCI services such as the AI Image and Document classification and the elaborated pipeline to identify and mask fields using OCI Data Safe.

For instance, creating the application from the Oracle Object Store Management template will produce an application with the following tasks, as Figure 2 displays below.

Oracle Object Store Management Template tasks

Figure – 2 Oracle Object Store Management Template tasks

The tasks on this template are parameterized. Therefore, just requesting the task to run would allow it to get executed in your tenancy, region, and bucket, provided the policy statements allow the Data Integration Service (DIS) to do so, as described in the preview section. Figure 3 shows the parameter window for task execution.

Task RenameObject parameters from Oracle Object Store Management Template

Figure – 3 Task RenameObject parameters from Oracle Object Store Management Template

These parameters include the source and target object names and the file names in the object store. For the region, examples are us-ashburn-1, eu-frankfurt-1, ap-mumbai-1, and so far. Namespace and tenant name are usually the same as the tenant name (note for tenancies older than December 2019, the namespace may differ from the tenant name. The bucket information in the console has information about the Object Store Namespace).

The example below illustrates running the RenameObject task, replacing the default values, mainly passing the metering_hourly_detail.csv file to be renamed as processed/metering_hourly_detail.csv in the template_validation bucket. This creates the processed folder in that bucket, providing the idea of how to archive files that have been processed quickly.

RenameObject task executed by passing the parameters

Illustration – 1 RenameObject task executed by passing the parameters

Templates with Data Assets

For conventional processing of data, for instance, importing data into OCI Autonomous Databases, templates expose Integration Tasks controlling the data pulls in increments or loading different data formats such as CSV, JSON, Parquet, and Avro into relational structures. 

This second set of templates involves the use of Data Assets. For these templates, after copying the application, it is necessary to ‘remap’ the available data assets in the workspace. The assets usually refer to an Object Store, Autonomous or Oracle Database, and eventually a Fusion Application data asset. 

An example of new References for Oracle DB to the ADW Incremental Load template is shown in Figure 4.

Oracle DB to ADW Incremental Load - Data Assets References remap

 

Figure – 4 Oracle DB to ADW Incremental Load – Data Assets References remap

Generally, the tasks within the template can also be remapped to other tasks. In the case of the template above, if you choose to use a different nomenclature for the tasks involved in this template due to further customization, these tasks can also be remapped using the Tasks tab.   

In the case of this incremental load template above, the remap of the data assets to the ones you might have configured in your workspace would allow the tasks to run forward, using the parameter sections for each task. For instance, Figure 5 shows the parameter window for the  Oracle to ADW Incremental Pipeline Task. Please follow this tutorial for assistance setting up Data Assets in your workspace.

Oracle DB to ADW Pipeline Tasks - Parameters for the task execution window

Figure – 5 Oracle DB to ADW Pipeline Tasks – Parameters for the task execution window

In the case of this template, there are parameters to identify the Schema, Entity, and Column name from the source subject of the incremental pull by reading a metadata table. Note the data type Varchar.

Then a parameter is to identify the schema where incremental metadata resides in the Target data asset.

Customizing Templates

But wouldn’t it be great if I could eventually customize the tasks on this template? For instance, add a further transformation in the integration tasks or create another task to run within the template pipeline.

Once the template data assets have been remapped, we can View the Definition of a particular task in the template and Save As in a new Project or Folder. This will copy the Data Flow or Pipeline design artifacts used in the template application and reference the existing Data Assets available in your workspace. In that way, the design artifacts are fully functional throughout the DIS console user interface.

The illustration below reviews the dataflow embedded in the Oracle DB to ADW Incremental Load template, then save it as a copy in the specified folder. Within that dataflow in the DIS design canvas, it is possible to change the default values for the parameters used in the original. Therefore, further modifications to this template task can be introduced and leveraged in another application. Note the error produced “Unable to retrieve data entities…” is expected. The data assets have been modified, but the template default entity does not exist in the new schema. Changing the parameter default values would allow the data flow to validate successfully.

Review and Save As templates tasks into another Folder to access design components

Illustration – 2 Review and Save As templates tasks into another Folder to access the design components

Conclusion

Many daily management tasks can be easily automated using or reusing template tasks. In addition, templates expand the DIS data processing and management capabilities by offering a distinct set of tasks tailored to assist data engineers. Use cases to call other OCI services such as AI for document classifications, Data Safe for masking content to be stored, and controlling and reporting in the incremental feed to ADW are template building blocks to ease of DIS use. 

The list of templates currently available on this release are:

  • Oracle Object Store Management Template
    • Application with REST tasks for Object Storage to copy, delete, and rename objects and to create and delete buckets.
  • Oracle Vision Image Template
    • Application with REST tasks for performing OCI Vision Image Analysis. The tasks include image classification, object detection, and image text detection.
  • Oracle Vision Document Template
    • Application with REST tasks for performing OCI Vision Document AI. The tasks include document classification, document key-value detection, document language classification, document table detection, and document text detection. 
  • Oracle DataSafe Masking Template
    • Application with parameterized tasks to generate a Data Safe sensitive model and masking from a target Oracle database schema.
  • Load Files from Object Storage to ADW template

    • Application with tasks to load different file types from OCI Object Storage into Autonomous Data Warehouse: JSON, Parquet, CSV, Avro.
  • Oracle DB to ADW Incremental Load (Customer Managed) template

    • Application that allows incremental tasks to run based on and report the last execution in a metadata table stored in an Autonomous Data Warehouse target schema.

And more templates will come with each new release, making the DIS implementations on the use or even research more straightforward.