Oracle Analytics Data Flow Cheat Sheet: Making Sense of Large Amounts of Data

January 25, 2022 | 4 minute read
Benjamin Arnulf
Senior Director, Product Strategy, Oracle Analytics

Data Flows in Oracle Analytics enable you to customize, organize, and integrate your data to produce a curated dataset that your users can glean insights from right away. And because Oracle Analytics is a richly functional cloud service, it helps you do this automatically and repeatably, so you streamline your work while increasing value for your organization. Win-win!

Let’s take a classic example: creating a meaningful view of sales data.  To do that, you might merge two datasets that contain quarterly sales data, strip out columns that you don't need, aggregate the value of your sales, and save the results in a new dataset named "Quarterly Sales Summary." But a Data Flow offers much more powerful functionality than merely stripping out unnecessary columns. You can add, join, prepare, and transform data, create machine learning models, apply database analytics, and much more.
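The logic of this example can be sketched in plain Python. This is only an illustration of the steps (merge, select columns, aggregate), not Oracle Analytics code: Data Flows are assembled in the product itself, and the sample rows, column names, and the helper `quarterly_sales_summary` below are all hypothetical. The sketch also assumes that "merge" means stacking the rows of the two datasets.

```python
from collections import defaultdict

# Hypothetical quarterly sales datasets (column names are assumptions).
q1_sales = [
    {"region": "East", "product": "Widget", "rep": "A. Lee", "amount": 1200.0},
    {"region": "West", "product": "Widget", "rep": "B. Cho", "amount": 800.0},
]
q2_sales = [
    {"region": "East", "product": "Gadget", "rep": "A. Lee", "amount": 500.0},
    {"region": "West", "product": "Widget", "rep": "C. Diaz", "amount": 700.0},
]


def quarterly_sales_summary(*datasets, keep=("region", "amount")):
    # Step 1: merge the datasets by stacking their rows.
    merged = [row for dataset in datasets for row in dataset]
    # Step 2: strip out the columns that are not needed.
    trimmed = [{col: row[col] for col in keep} for row in merged]
    # Step 3: aggregate the sales value, grouped by region.
    totals = defaultdict(float)
    for row in trimmed:
        totals[row["region"]] += row["amount"]
    # Step 4: return the result as a new dataset, sorted by region.
    return [{"region": r, "total_sales": t} for r, t in sorted(totals.items())]


summary = quarterly_sales_summary(q1_sales, q2_sales)
```

Here `summary` plays the role of the "Quarterly Sales Summary" dataset: one row per region with the total sales across both quarters.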

"One of our strategic goals is to empower analysts to make large data stores meaningful," says Pravin Janardanam, Director, Product Management, Oracle Analytics. "Oracle Analytics achieves this with a comprehensive set of data preparation, data enrichment, and advanced data transformation capabilities built into Oracle Analytics Data Flow. Data Flow makes it easy to bring together data from various sources and lay a foundation for discovering insights."

With Data Flows in Oracle Analytics, you can run your analyses with no infrastructure to deploy or manage and no prior knowledge of scripting or tools such as Spark. You simply use your existing connections, and Oracle Analytics automatically takes full advantage of the power of your backend system. Data Flows make running these transformations easy, repeatable, secure, and simple to share across your enterprise.

Data Flows use a variety of connectors and operations. There are four groups of functions at the heart of Data Flows in Oracle Analytics:

1. Data Ingestion, to aggregate data for analytics.

2. Data Preparation, to organize data in meaningful ways.

3. Machine Learning, to automatically detect meaning in data using Machine Learning algorithms.

4. Database Analytics, to present data in numerical and visual ways that provide new insights.

A closer look at each group shows the functions it provides for handling large data stores and extracting the most value from your data.

  1. Data Ingestion
  • Add Data - Add from hundreds of data source types
  • Join - Join datasets using matching or all rows
  • Union Rows - Union all, unique or common rows
  • Filter - Filter a dataset or add an expression filter
  • Aggregate - Aggregate and group by columns
  • Save Datasets - Save data to object storage or the database
  • Create Essbase Cube - Save data to a new Essbase Cube
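To make the Join and Union Rows options above concrete, here is a minimal sketch in plain Python. It is not Oracle Analytics code; the sample data and the helper `join` are hypothetical, and the sketch assumes "matching rows" behaves like an inner join, "all rows" keeps unmatched rows from the first input, and "common rows" means rows present in both inputs.

```python
# Hypothetical datasets keyed by customer_id.
orders = [
    {"customer_id": 1, "amount": 50},
    {"customer_id": 2, "amount": 75},
    {"customer_id": 4, "amount": 20},
]
customers = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
    {"customer_id": 3, "name": "Initech"},
]


def join(left, right, key, all_left_rows=False):
    """Join two lists of dicts on `key`, assuming `key` is unique in `right`."""
    lookup = {row[key]: row for row in right}
    result = []
    for row in left:
        match = lookup.get(row[key])
        if match is not None:
            result.append({**row, **match})  # matching row: combine columns
        elif all_left_rows:
            result.append(dict(row))  # keep the unmatched row as-is
    return result


matching_only = join(orders, customers, "customer_id")
all_order_rows = join(orders, customers, "customer_id", all_left_rows=True)

# Union variants on two lists of rows (tuples, so they are hashable).
a = [("East", 2021), ("West", 2021), ("East", 2021)]
b = [("West", 2021), ("North", 2021)]
b_rows = set(b)

union_all = a + b                           # every row, duplicates included
union_unique = list(dict.fromkeys(a + b))   # duplicates removed, order kept
union_common = [row for row in dict.fromkeys(a) if row in b_rows]
```

The order with `customer_id` 4 has no matching customer, so it appears only in `all_order_rows`.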
  2. Data Preparation
  • Add Columns - Add new columns using one of over 100 functions
  • Select Columns - Select columns to keep in the dataset
  • Rename Columns - Rename all columns at once
  • Transform Columns - Transform columns using one of over 100 functions
  • Merge Columns - Merge multiple columns using delimiters
  • Split Columns - Split columns using delimiters and parts
  • Bin - Create bins for a measure
  • Group - Group values in a dimension
  • Branch - Branch the dataset into multiple datasets
  • Cumulative Value - Calculate cumulative values by measure
  • Time Series Forecast - Forecast data using ETS, ARIMA, or Seasonal ARIMA
  • Analyze Sentiment - Analyze emotion from text data
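Two of the preparation steps above, Bin and Cumulative Value, can be illustrated with a short plain-Python sketch. It shows the idea only and is not how Oracle Analytics implements these steps; the measure values, bin boundaries, and labels are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

amounts = [5, 42, 18, 77, 30]  # hypothetical measure values

# Bin: assign each value to a labelled range.
# Values below 20 are "low", 20 up to (but not including) 50 are "medium",
# and 50 or more are "high".
boundaries = [20, 50]
labels = ["low", "medium", "high"]
bins = [labels[bisect_right(boundaries, value)] for value in amounts]

# Cumulative Value: running total of the measure in row order.
running_total = list(accumulate(amounts))
```

With these inputs, `bins` is `["low", "medium", "low", "high", "medium"]` and `running_total` is `[5, 47, 65, 142, 172]`.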
  3. Machine Learning

Train Numeric Prediction - Train a model using four algorithm scripts:

  • CART
  • Linear Regression
  • Elastic Net Linear Regression
  • Random Forest

Train Multi-Classifier - Train a model using five algorithm scripts:

  • SVM
  • Neural Network
  • Naive Bayes
  • Random Forest
  • CART

Train Clustering - Train a model using two algorithm scripts:

  • K-Means
  • Hierarchical Clustering

Train Binary Classifier - Train a model using six algorithm scripts:

  • SVM
  • Neural Network
  • Naive Bayes
  • Logistic Regression
  • Random Forest
  • CART

Apply Model - Apply an Analytics or Database model.
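The train-then-apply pattern behind these steps can be sketched with a one-variable linear regression in plain Python. This is a conceptual illustration only: it is not one of the algorithm scripts listed above, and the function names `train_linear_regression` and `apply_model` and the sample data are hypothetical.

```python
def train_linear_regression(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {"slope": slope, "intercept": intercept}


def apply_model(model, xs):
    """Score new inputs with a previously trained model."""
    return [model["slope"] * x + model["intercept"] for x in xs]


# "Train" step: learn from historical data (hypothetical values).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]
model = train_linear_regression(ad_spend, sales)

# "Apply" step: predict for new rows using the saved model.
predictions = apply_model(model, [5.0, 6.0])
```

Separating training from applying mirrors the list above: a model is trained once in one flow and can then be applied to new data in another.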

  4. Database Analytics includes both Oracle Database Analytics and Graph functions:

Oracle Database Analytics Functions:

  • Dynamic Clustering
  • Time Series
  • Sampling Data
  • Un-pivoting Data
  • Dynamic Anomaly Detection
  • Frequent Itemsets
  • Text Tokenization

Oracle Database Graph Functions, including:

  • Sub Graph
  • Clustering
  • Shortest Path
  • Node Ranking
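As an illustration of what a graph function such as Shortest Path computes, here is a minimal breadth-first search over an unweighted directed graph in plain Python. It is a generic textbook sketch, not the Oracle Database implementation; the graph and the helper `shortest_path` are hypothetical.

```python
from collections import deque

# Hypothetical directed graph as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}


def shortest_path(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


path = shortest_path(graph, "A", "E")
```

Because the edges are directed, `shortest_path(graph, "E", "A")` returns `None`: there is no route back from E to A.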

As you can see, Data Flows offer extensive functionality that takes you from data collection through data analytics. We're adding more functions to Data Flows, including expanded anomaly detection and much more. Many of these features are powered by Oracle Machine Learning algorithms, which are the subject of another blog post.

Schedule a meeting today to talk to an Oracle Analytics product team member about how you can gain new insights from your data with Oracle Analytics.

Benjamin Arnulf is Senior Director, Product Strategy covering Oracle Analytics and AI.
