Oracle Analytics Cloud (OAC) data flows provide the ability to merge datasets in data preparation workflows. Combining data through a union presents several benefits including:
This article focuses on the best practices for combining datasets with unions so that you can effectively use this feature. If you're interested in learning about the different union types supported in OAC (Union All, Union, Intersect, among others) with several examples, visit this blog.
There are three general best practices to consider when combining data with unions from two source datasets:
The following description explains how you can implement these best practices using the tools in OAC.
There are a few methods to ensure that the input datasets you would like to combine contain the same number of columns. If you have read-write or full control access to the source datasets, open the dataset and look at the number of data elements (that is, number of columns) present in each table at the bottom of the screen. This approach is helpful if you're working with datasets that contain many columns. If you're working with smaller datasets, you can simply count the number of columns in the dataset Inspect panel, through the dataset preparation editor, or in the data flow itself.
You can use the reordering feature in the 'Select Columns' step to easily modify the order in which the columns appear in each input dataset. There are three main ways to rearrange columns using the reordering feature:
The arrows in the following image indicate the three locations where you can use the column reordering feature:
There are a few ways that you can check the column data types and several methods to change the data types.
1. Verify and optionally change the column data types in the data preparation editor. To change the data types in the source dataset itself, you need read-write permissions at a minimum. Open the dataset and click the column of interest. On the left-hand side, you see a panel with some information related to the column, including data type. To change the data type, click the ellipsis next to the column name and select Convert to <data type>.
2. Verify and optionally change the column data types in the data flow. This won't change the column type in the original dataset, only in the data flow. (This is the recommended approach if you don't have read-write access to the source dataset, or if you don't want to modify the dataset). You can change the data type of a column through the Transform Column step in a data flow. The following image shows converting the COST_FIXED column from a data type of text to a number data type using the Cast() function. Visit this blog if you're attempting to convert a text column to a date-time column.
Now that you know more about best practices, you can try some of these suggestions in your data preparation workflows when combining datasets. For more data flow-related blogs, visit my Oracle Analytics blog profile, and you can also ask questions about this and many other topics on the Oracle Analytics Community site.