Oracle Analytics provides powerful self-service visual data-editing capabilities to let you quickly and easily clean, normalize, and enrich data for analysis. This post provides tips to help you when preparing datasets for analysis.
Removing Columns from a Spreadsheet
When you have a dataset that has a Microsoft Excel spreadsheet as the source, removing columns that you no longer need can be a little challenging. You can’t remove them directly from the workbook by using the Reload Data option; you'll see an error message stating that the new spreadsheet doesn't match the old one. And the Replace Data option requires you to create a dataset. The trick to solve this is that if you want to remove columns that you no longer need from a spreadsheet and keep using the same dataset, you must open the dataset in the Dataset editor and open Edit Definition for that dataset.
Suppose that you have a dataset with a source spreadsheet with 18 columns and you need to remove one of those columns. In this example, you need to remove the Product Container column.
Open the dataset and click Edit Definition.
Then click Select beside the Uploaded File field.
Select the spreadsheet with the removed columns and click OK.
The new dataset is ingested and profiled. If there are any transforms for that column, they are automatically deleted from the transform script. And that's it. You've successfully removed any unwanted columns.
Note: Before removing columns, ensure that those columns aren't used in any workbooks.
Default Format for Number and Date Columns
Did you know that as the dataset author, you can set a default format for important number and date columns? This default format is inherited by all the workbooks that use the dataset, but it can also be overridden at the workbook level to customize formatting for different visualizations.
Setting a Default Format for Numbers
To set a default format for numbers, click the number column in the Dataset editor, and click the pound sign (#). You see the property of Number Format set to Auto. Click the Auto link and a drop-down menu is displayed where you can select the appropriate format. For example, you can set the default format for the Profit column to Currency. You can override other properties of currency, such as the currency type, the thousand separator, the number of decimal places, and the abbreviation setting.
Setting a Default Format for Dates
To set a default format for dates, click the date column in the Dataset editor, and click the clock sign (🕓). You see the property of Format set to Auto. Click the Auto link and a drop-down menu is displayed where you can select the desired date format. For example, you can set the default format for the Order_Date column to YYYY/MM/DD. If you'd like, you can also change the Show By property, which allows the display of only a portion of the date, such as only month or the number of days in the year, and similar.
Overriding the Default Format in the Dataset in Workbooks
Once the default formats have been set in the dataset, users can override them on individual visualizations based on their requirements. If you select the column from the Data Elements Tree, the properties panel shows the default formats and you can’t override them globally. To override the default formats in workbook visualizations, follow these steps for each visualization. Select the visualization and notice that a visualization Properties panel is displayed. Locate the pound icon (#) for number or the clock icon (🕓) for dates, and you can change many of the default format settings to your requirements.
Controlling the Placement of Calculated Columns
Previously, calculated columns that didn't involve a particular column were placed at the end of the column list and users had no control over that placement. A new feature in the March Update allows you to control the placement of a new calculated column by allowing you to select an existing column prior to clicking the Plus (+) icon. By clicking an existing column, the new calculated column is placed to the right of the selected column. In this example, you want to add a new calculated column with a very simple literal 1 and place it between the Profit and Quantity_Ordered columns. To do that, you first click to select the Profit column, even though it doesn't participate in the formula; it's an indicator to the system to place the new calculated column to the right of that column. Next add your calculation, and click Add Step. Notice that the new1 column was added to the right of the Profit column.
Multi-Select in Metadata View to Change Properties
Are you always needing to change the Treat-As or Aggregation type in multiple columns in your new datasets? Did you know that you can select multiple columns and make those property changes in one action? As of the January Update, this new productivity feature is available in the Metadata View of the Dataset editor.
To use it, select the first column, press and hold the Command key on Mac or Ctrl key on Windows, then click the items (they don't need to be next to each other). To select multiple items that are adjacent, click the first column, then press the Shift key (both Mac and Windows) and click the last item. Once the items are selected, they're highlighted. You can now make the appropriate property changes using the first column that you selected. Note that you can’t use the Properties panel on the left; you must make the change on the Metadata View. In this example, you change the aggregation type for multiple measure columns all at once. The editor won't force you to select the same column types, but they apply the property changes to the columns where that property is relevant. You can also make more than one property change for the selected group, such as converting an attribute to a measure and then selecting the aggregation type for the same group without needing to select it.
Call to Action
I hope you found some of these tricks useful in helping you quickly prepare your datasets for analysis. Now that you know about these helpful tricks in the Oracle Analytics Dataset editor, I challenge you to go out and try them yourself! Whether you're working with small spreadsheets or big tables from a data warehouse, there's nothing better than spending less time wrangling data and spending more time where the value is — analyzing your data! For details on working with datasets, see About the Dataset Editor.