Tuesday Jul 12, 2016

Mining Structured Data and Unstructured Data using Oracle Advanced Analytics 12c

The Oracle Advanced Analytics (OAA) Database Option leverages Oracle Text, a free feature of the Oracle Database, to pre-process (tokenize) unstructured data for ingestion by the OAA data mining algorithms.  By moving parallelized implementations of machine learning algorithms inside the Oracle Database, data movement is eliminated and we can leverage other strengths of the Database such as Oracle Text (not to mention security, scalability, auditing, encryption, backup, high availability, geospatial data, etc.).
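To make the tokenization step concrete, here is a minimal Python sketch of the general idea: free text is split into tokens, and per-term counts become extra columns alongside the structured attributes. It is not Oracle Text or the OAA API. The stop-word list, the column names (age, tenure_months, comment), and the sample records are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real text preprocessor uses a much longer one.
STOP_WORDS = {"a", "an", "and", "the", "is", "was", "to", "of", "my", "i", "it"}


def tokenize(text):
    """Lowercase the text, split on non-letters, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_vocabulary(documents, max_terms=10):
    """Keep the most frequent tokens across all documents as text features."""
    counts = Counter(tok for doc in documents for tok in tokenize(doc))
    return [term for term, _ in counts.most_common(max_terms)]


def to_feature_row(record, vocabulary):
    """Merge structured columns with per-term counts from the free-text column."""
    term_counts = Counter(tokenize(record["comment"]))
    row = {"age": record["age"], "tenure_months": record["tenure_months"]}
    for term in vocabulary:
        row["text_" + term] = term_counts[term]  # Counter yields 0 for absent terms
    return row


# Hypothetical records mixing structured attributes and an unstructured comment.
records = [
    {"age": 34, "tenure_months": 12,
     "comment": "The service was slow and the agent was rude"},
    {"age": 51, "tenure_months": 60,
     "comment": "Great service, fast and friendly agent"},
]

vocabulary = build_vocabulary([r["comment"] for r in records])
rows = [to_feature_row(r, vocabulary) for r in records]

for row in rows:
    print(row)
```

Each resulting row can be fed to a classification or clustering algorithm like any other table row, which is the point of combining structured and unstructured data in one place.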

This Mining Structured Data and Unstructured Data using Oracle Advanced Analytics 12c YouTube video presents an overview of the capabilities for combining and performing data mining on both structured and unstructured data.  The video includes several quick demonstrations on classification and clustering using unstructured data and provides instructions and links on how to get started, either on premises or on the Oracle Cloud.

I hope you find this helpful and a pleasure to watch.   

Presentation Slides.

You can also access similar YouTube videos at this Oracle Data Mining at the Movies blog posting.

Follow CharlieDataMine on Twitter.

Thanks for watching!


Sr. Dir. of Product Management, Oracle Advanced Analytics and Data Mining


Wednesday Mar 09, 2016

Learn Predictive Analytics in 2 Days - New Oracle University Course!

What you will learn:  This Predictive Analytics using Oracle Data Mining Ed 1 training will review the basic concepts of data mining. Expert Oracle University instructors will teach you how to leverage the predictive analytical power of Oracle Data Mining, a component of the Oracle Advanced Analytics option.

Learn To:

  • Explain basic data mining concepts and describe the benefits of predictive analysis.
  • Understand primary data mining tasks, and describe the key steps of a data mining process.
  • Use the Oracle Data Miner to build, evaluate, apply, and deploy multiple data mining models.
  • Use Oracle Data Mining's predictions and insights to address many kinds of business problems.
  • Deploy data mining models for end-user access, in batch or real-time, and within applications.

Benefits to You

When you've completed this course, you'll be able to use the Oracle Data Miner 4.1, the Oracle Data Mining "workflow" GUI, which enables data analysts to work directly with data inside the database. The Data Miner GUI provides intuitive tools that help you explore the data graphically, build and evaluate multiple data mining models, apply Oracle Data Mining models to new data, and deploy Oracle Data Mining's predictions and insights throughout the enterprise.

Oracle Data Miner's SQL APIs - Get Results in Real-Time

Oracle Data Miner's SQL APIs automatically mine Oracle data and deploy results in real-time. Because the data, models, and results remain in the Oracle Database, data movement is eliminated, security is maximized and information latency is minimized.


  • Course Objectives
  • Suggested Course Prerequisites
  • Suggested Course Schedule
  • Class Sample Schemas
  • Practice and Solutions Structure
  • Review location of additional resources

Predictive Analytics and Data Mining Concepts

  • What is Predictive Analytics?
  • Introducing the Oracle Advanced Analytics (OAA) Option
  • What is Data Mining?
  • Why use Data Mining?
  • Examples of Data Mining Applications
  • Supervised Versus Unsupervised Learning
  • Supported Data Mining Algorithms and Uses

Understanding the Data Mining Process

  • Common Tasks in the Data Mining Process
  • Introducing the SQL Developer interface

Introducing Oracle Data Miner 4.1

  • Data mining with Oracle Database
  • Setting up Oracle Data Miner
  • Accessing the Data Miner GUI
  • Identifying Data Miner interface components
  • Examining Data Miner Nodes
  • Previewing Data Miner Workflows

Using Classification Models

  • Reviewing Classification Models
  • Adding a Data Source to the Workflow
  • Using the Data Source Wizard
  • Using Explore and Graph Nodes
  • Using the Column Filter Node
  • Creating Classification Models
  • Building the Models
  • Examining Class Build Tabs

Using Regression Models

  • Reviewing Regression Models
  • Adding a Data Source to the Workflow
  • Using the Data Source Wizard
  • Performing Data Transformations
  • Creating Regression Models
  • Building the Models
  • Comparing the Models
  • Selecting a Model

Using Clustering Models

  • Describing Algorithms used for Clustering Models
  • Adding Data Sources to the Workflow
  • Exploring Data for Patterns
  • Defining and Building Clustering Models
  • Comparing Model Results
  • Selecting and Applying a Model
  • Defining Output Format
  • Examining Cluster Results

Performing Market Basket Analysis

  • What is Market Basket Analysis?
  • Reviewing Association Rules
  • Creating a New Workflow
  • Adding a Data Source to the Workflow
  • Creating an Association Rules Model
  • Defining Association Rules
  • Building the Model
  • Examining Test Results

Performing Anomaly Detection

  • Reviewing the Model and Algorithm used for Anomaly Detection
  • Adding Data Sources to the Workflow
  • Creating the Model
  • Building the Model
  • Examining Test Results
  • Applying the Model
  • Evaluating Results

Mining Structured and Unstructured Data

  • Dealing with Transactional Data
  • Handling Aggregated (Nested) Data
  • Joining and Filtering data
  • Enabling mining of Text
  • Examining Predictive Results

Using Predictive Queries

  • What are Predictive Queries?
  • Creating Predictive Queries
  • Examining Predictive Results

Deploying Predictive models

  • Requirements for deployment
  • Deployment Options
  • Examining Deployment Options

Monday Feb 29, 2016

Guest Lecture on Big Data & Analytics to U. Kansas Business School Students

Recently, I was asked by a friend and colleague, Chris Claterbos, Lecturer at University of Kansas' Business School, to deliver a guest lecture to his business analytics students.  

In preparation, so as not to make it an entirely Oracle "product" presentation, I tried to gather some general information on the big data + analytics market, job opportunities & careers, and future musings about where the industry is headed.  I liked the resulting presentation, so I am posting and sharing it here.

U. Kansas Guest Lecture on Big Data Analytics with Oracle's Advanced Analytics, Big Data SQL and Cloud

  • Big Data + Analytics “Phenomenon”
  • Careers in Big Data Analytics
  • Product
    • Oracle Advanced Analytics Overview & Features/Benefits
    • Brief Demos
  • Example Customer References
  • Applications “Powered by OAA”
  • Getting Started
  • Q & A

Enjoy!  Hopefully you all become data'n'science stars!


Wednesday Feb 03, 2016

Links to Presentations: BIWA Summit'16 - Big Data + Analytics User Conference Jan 26-28, @ Oracle HQ Conference Center

We had a great www.biwasummit.org event with ~425 attendees, in-depth technical presentations delivered by experts, and even several 2-hour Hands-on Labs training classes that used the Oracle Database Cloud!  Watch for more coverage of the event in various Oracle marketing and partner content venues.

Many thanks to all the BIWA board of directors and the many volunteers who have put in so much work to make this BIWA Summit the best BIWA user event ever.  Mark your calendars for BIWA Summit’17, January 31, Feb. 1 & Feb. 2, 2017.  We’ll be announcing the Call for Abstracts in the future, so please direct your best customers and speakers to submit.  We’re aiming to continue to make BIWA + Spatial + YesSQL Summit the best focused user gathering for sharing best practices for novel and interesting use cases of Oracle technologies.

BIWA is an IOUG SIG run entirely by customers, partners and Oracle employee volunteers.  We’re always looking for people who would like to be involved.  Let me know if you’d like to contribute to the planning and organization of future BIWA events and activities.

See everyone at BIWA’17!

Charlie, on behalf of the entire BIWA board of directors  (charlie.berger@oracle.com)

(see www.biwasummit.org for more information)

See List of BIWA Summit'16 Presentations below.  Click on Details to access the speaker’s abstract and download the files (assuming the speaker has posted them for sharing).

We now have a schedule at a glance to show you all the sessions in a tabular agenda.


See bottom of page for the Session Search capability

Below is a list of the sessions and links to download most of the materials for the various sessions.  Click on the DETAILS button next to the session you want to download, then the page should refresh with the session description and (assuming the presenter uploaded files, but be aware that files may be limited to 5MB) you should see a list of files for that session.  See the full list below:

Advanced Analytics

Presentations (Click on Details to access file if submitted by presenter)

Dogfooding – How Oracle Uses Oracle Advanced Analytics To Boost Sales Efficiency


Oracle Modern Manufacturing - Bridging IoT, Big Data Analytics and ERP for Better Results


Predictive Modelling and Forecasting using OER


Enabling Clorox as Data Driven Enterprise


Fault Detection using Advanced Analytics at CERN's Large Hadron Collider: Too Hot or Too Cold


Large Scale Machine Learning with Big Data SQL, Hadoop and Spark


Stubhub and Oracle Advanced Analytics


Fiserv Case Study: Using Oracle Advanced Analytics for Fraud Detection in Online Payments


Advanced Analytics for Call Center Operations


Machine Learning on Streaming Data via Integration of Oracle R Enterprise and Oracle Stream Explorer


Learn Predictive Analytics in 2 hours!! Oracle Data Miner 4.0 Hands on Lab


Scaling R to New Heights with Oracle Database


Predictive Analytics using SQL and PL/SQL


Big Data Analytics with Oracle Advanced Analytics 12c and Big Data SQL and the Cloud


Improving Predictive Model Development Time with R and Oracle Big Data Discovery


Oracle R Enterprise 1.5 - Hot new features!


Is Oracle SQL the best language for Statistics


BI and Visualization


Electoral fraud location in Brazilian General Elections 2014


The State of BI


Case Study of Improving BI Apps and OBIEE Performance


Preparing for BI 12c Upgrade


Data Visualization at Sound Exchange – a Case Study


Integrating OBIEE and Essbase, Why it Makes Sense


The Dash that changed a culture


Optimize Oracle Business Intelligence Analytics with Oracle 12c In-Memory Database option


Oracle Data Visualization vs. Answers: The Cage Match


What's New With Oracle Business Intelligence 12c


Workforce Analytics Leveraging Oracle Business Intelligence Cloud Services (BICS)


Defining a Roadmap for Migrating to Oracle BI Applications on ODI


See What’s There and What’s Coming with BICS & Data Visualization


Free form Data Visualization, Mashup BI and Advanced Analytics with BI 12c


Oracle Data Visualization Cloud Service Hands-On Lab with Customer Use Cases


On Metadata, Mashups and the Future of Enterprise BI


OBIEE 12c and the Leap Forward in Lifecycle Management


Supercharge BI Delivery with Continuous Integration


Visual Analyzer and Best Practices for Data Discovery


BI Movie Magic: Maps, Graphs, and BI Dashboards at AMC Theatres


Oracle Business Intelligence (OBIEE) the Smart View Way


Big Data


Oracle Big Data: Strategy and Roadmap


Oracle Modern Manufacturing - Bridging IoT, Big Data Analytics and ERP for Better Results


Leveraging Oracle Big Data Discovery to Master CERN’s Control Data


Enrich, Transform and Analyse Big Data using Big Data Discovery and Visual Analyzer


Oracle Big Data SQL: Unified SQL Analysis Across the Big Data Platform


High Speed Video Processing for Big Data Applications


Enterprise Data Hub with Oracle Exadata and Oracle Big Data Appliance


How to choose between Hadoop, NoSQL or Oracle Database


Analytical SQL in the Era of Big Data


Cloud Computing


Oracle DBaaS Migration Road Map


Centralizing Spatial Data Management with Oracle Cloud Databases


End Users data in BI - Data Mashup and Data Blending with BICS , DVCS and BI 12c


Oracle BI Tools on the Cloud--On Premise vs. Hosted vs. Oracle Cloud


Hybrid Cloud Using Oracle DBaaS: How the Italian Workers Comp Authority Uses Graph Technology


Build Your Cloud with Oracle Engineered Systems


Safe Passage to the CLOUD – Analytics


Your Journey to the Cloud : From Dedicated Physical Infrastructure to Cloud Bursting


Data Warehousing and ETL


Getting to grips with SQL Pattern Matching


Making SQL Great Again (SQL is Huuuuuuuuuuuuuuuge!)


Controlling Execution Plans (without Touching the Code)


Taking Full Advantage of the PL/SQL Result Cache


Taking Full Advantage of the PL/SQL Compiler


Advanced SQL: Working with JSON Data


Oracle Database In-Memory Option Boot Camp: Everything You Need to Know


Best Practices for Getting Started With Oracle Database In-Memory


Extreme Data Warehouse Performance with Oracle Exadata


Real-Time SQL Monitoring in Oracle Database 12c


A Walk Through the Kimball ETL Subsystems with Oracle Data Integration


MySQL 5.7 Performance: More Than 1.6M SQL Queries per Second


Implement storage tiering in Data warehouse with Oracle Automatic Data Optimization


Edition-Based Redefinition Case Study


12-Step SQL Tuning Method


Where's Waldo? Using a brute-force approach to find an Execution Plan the CBO hides


Delivering an Enterprise-Wide Standard Chart of Accounts at GE with Oracle DRM


Agile Data Engineering: Introduction to Data Vault Data Modeling


Worst Practice in Data Warehouse Design


Same SQL Plan, Different Performance


Why Use PL/SQL?


Transforming one table to another: SQL or PL/SQL?


Understanding the 10053 Trace


Analytic Views - Bringing Star Queries into the Twenty-First Century


The Place of SQL in the Hybrid World


The Next Generation of the Oracle Optimizer


Internet of Things


Oracle Modern Manufacturing - Bridging IoT, Big Data Analytics and ERP for Better Results


Meet Your Digital Twin


Industrial IoT and Machine Learning - Making Wind Energy Cost Competitive


Fault Detection using Advanced Analytics at CERN's Large Hadron Collider: Too Hot or Too Cold


Big Data and the Internet of Things in 2016: Beyond the Hype


IoT for Big Machines


The State of Internet of Things (IoT)


Oracle Spatial Summit


Build Your Own Maps with the Big Data Discovery Custom Visualization Component


Massively Parallel Calculation of Catchment Areas in Retail


Dismantling Criminal Networks with Graph and Spatial Visualization and Analysis


Best Practices for Developing Geospatial Apps for the Cloud


Map Visualization in Analytic Apps in the Cloud, On-Premise, and Mobile


Best Practices, Tips and Tricks with Oracle Spatial and Graph


Delivering Smarter Spatial Data Management within Ordnance Survey, UK


Deploying a Linked Data Service at the Italian National Institute of Statistics


ATLAS - Utilizing Oracle Spatial and Graph with Esri for Pipeline GIS and Linear Asset Management


Oracle Spatial 12c as an Applied Science for Solving Today's Real-World Engineering Problems


Assembling a Large Scale Map for the Netherlands Using Oracle 12c Spatial and Graph


Using Open Data Models to Rapidly Develop and Prototype a 3D National SDI in Bahrain


Implementation of LBS services with Oracle Spatial and Graph and MapViewer in Zain Jordan


Interactive map visualization of large datasets in analytic applications


Gain Insight into Your Graph Data -- A hands on lab for Oracle Big Data Spatial and Graph


Applying Spatial Analysis To Big Data


Big Data Spatial: Location Intelligence, Geo-enrichment and Spatial Analytics


What’s New with Spatial and Graph? Technologies to Better Understand Complex Relationships


Graph Databases: A Social Network Analysis Use Case


High Performance Raster Database Manipulation and Data Processing with Oracle Spatial and Graph


3D Data Management - From Point Cloud to City Model


The Power of Geospatial Visualization for Linear Assets Using Oracle Enterprise Asset Management


Oracle Spatial and Graph: New Features for 12.2


Fast, High Volume, Dynamic Vehicle Routing Framework for E-Commerce and Fleet Management


Managing National Broadband Infrastructure at Turk Telekom with Oracle Spatial and Graph



Presentations (Click on Details to access file if submitted by presenter)

Taking Full Advantage of the PL/SQL Compiler


Taking Full Advantage of the PL/SQL Result Cache


Meet Your Digital Twin


Making SQL Great Again (SQL is Huuuuuuuuuuuuuuuge!)


Lightning Round for Vendors


Monday Oct 12, 2015

NHS Business Services Authority Gains Better Insight into Data, Identifies circa GBP100 Million (US$156 Million) in Potential Savings in Just Three Months


The NHS Business Services Authority (NHSBSA) is a special health authority and an arm’s length body of the Department of Health for England. It provides a range of critical central services to NHS organizations, contractors, patients, and the public. Services include managing the NHS Pension schemes in England and Wales, managing payments to primary care dental and pharmacy contractors, and administering the European Health Insurance Card (EHIC).

The NHS budget for 2015/16 is approximately GBP116 billion (US$179 billion) and the total funds administered by the NHSBSA (including those for the NHS Pension schemes) amount to circa GBP32 billion (US$48 billion). The Department of Health asked the NHSBSA to take a proactive role to identify opportunities to reduce costs and eliminate waste. One way to do this was to find better ways to use the vast volumes of data already collected and held within the organization to help reduce fraud and error throughout the health service.

The NHSBSA needed a new, centralized solution that would enable it to gain better value from its data, which is spread across a disparate set of IT systems, data, storage, and analytical capabilities. To achieve this, it chose an end-to-end Oracle solution including Oracle Advanced Analytics, Oracle Exadata Database Machine, Oracle Exalytics In-Memory Machine, Oracle Endeca Information Discovery, and Oracle Business Intelligence Enterprise Edition.

With this Oracle solution, the NHSBSA established its Data Analytics Learning Laboratory (DALL), investing in both technology and expertise to create insight from its data. Within the first three months of operation, the organization identified circa GBP100 million (US$156 million) in potential savings.

Uncovering Savings in Dentistry

A word from NHS Business Services Authority

  • “Oracle Advanced Analytics’ data mining capabilities and Oracle Exalytics’ performance really impressed us. The overall solution is very fast, and our investment very quickly provided value. We can now do so much more with our data, resulting in significant savings for the NHS as a whole.” – Nina Monckton, Head of Information Services, NHS Business Services Authority

The NHSBSA used analytics to identify significant savings within NHS dental services and find instances of activities which do not demonstrate good value for money.

“With Oracle Advanced Analytics, it is much easier to detect anomalies in behaviors. We used anomaly detection to discover where there might be evidence of inappropriate behavior in dentists’ claims, enabling NHS commissioners to follow up and challenge their activities,” explained Nina Monckton, head of information services, NHSBSA.
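As a rough illustration of what "detecting anomalies in behaviors" means in practice, the Python sketch below flags providers whose claim counts sit far from the typical value. It uses a deliberately simple median-based rule rather than the in-database algorithm the NHSBSA used, and the provider IDs, claim counts, and threshold are hypothetical.

```python
from statistics import median


def robust_scores(values):
    """Score each value by its distance from the median, in units of the
    median absolute deviation (MAD)."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # All values are (nearly) identical, so nothing stands out.
        return [0.0 for _ in values]
    return [abs(v - center) / mad for v in values]


def flag_anomalies(claims_by_provider, threshold=5.0):
    """Return the providers whose score exceeds the (hypothetical) threshold."""
    providers = list(claims_by_provider)
    scores = robust_scores([claims_by_provider[p] for p in providers])
    return [p for p, s in zip(providers, scores) if s > threshold]


# Hypothetical monthly claim counts per provider.
claims = {"P01": 102, "P02": 98, "P03": 110, "P04": 95, "P05": 105, "P06": 310}

print(flag_anomalies(claims))
```

A flagged provider is only a candidate for follow-up, in the same way the NHSBSA's results were passed to commissioners to investigate rather than treated as proof of wrongdoing.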

Preventing Fraud for European Health Insurance Card

The EHIC is available to all European citizens covered by a statutory social security scheme and entitles them to free healthcare while visiting other European countries. 

During analysis of EHIC data, the NHSBSA discovered commercial addresses being used fraudulently to apply for EHIC cards and uncovered the use of invalid NHS and National Insurance numbers to apply for a card. 

“We used Oracle Exalytics and Oracle Business Intelligence for the EHIC application to improve the front-end validation process, prevent fraud, and blacklist addresses showing suspicious activities,” Monckton said.

Analyzing Billions of Records in Minutes

The NHSBSA receives data relating to more than one billion prescription items dispensed in primary care settings each year. Previously, the NHSBSA did not have the computing power to analyze this data at transaction level.

The NHSBSA can now analyze billions of records at one time, and by analyzing much larger sets of patient data, the NHSBSA can provide insight that is helping to improve standards of care throughout the health service.

“Previously, our information analysts did not have the ability to directly query data as it was mainly held in live operational systems. Now that we are able to transfer data to our Exadata environment, we have dramatically improved our ability to deliver value from our data,” Monckton said.

Analyzing Unstructured Text to Measure Satisfaction

Improving Data Matching To Save Millions of Dollars

In England, some people are entitled to free medical prescriptions or dental treatment from the NHS. The NHSBSA works with the Department for Work and Pensions (DWP) to establish that those patients declaring that they are exempt from a charge for dental treatment and/or medical prescriptions are claiming correctly. Using Oracle Exalytics to compare datasets, the NHSBSA reduced the rate of non-matching records for dentistry from 15% to just 5%.

The Role of Data Governance

Data is now moving to the heart of all NHSBSA programs. As a result of the organization’s new analytics capability, teams have a better understanding of what they can do with the data and are more careful about what data they collect. 

“We now know that if we collect the right data at the start of a program, we can measure what is working down the line. We are starting to change the culture of the organization around our data governance. There has been a massive shift. Data is now central to all our new programs, and data governance is at the heart of everything we do,” Monckton said.

Using the Data Analytics Learning Laboratory to Achieve Strategic Goals

The NHSBSA’s data analytics investment is helping the organization to achieve its five-year strategic goals, which include helping to save GBP1 billion (US$1.56 billion) for NHS patients, reducing unit costs by 50%, improving service and delivering great results for customers, and deriving insight from data to drive change.

“With our newly established Data Lab in place, we can add even more value to the NHS. I cannot begin to describe how significant that has been. This project is really helping us to achieve our strategic goals. In addition, we are working in a different way now and it has even helped with how people interact and function in the workplace.

“We’ve had a very positive response, and our chief executive is extremely impressed with our achievements and the results we have shown so far. As a result, management is recommending that our suppliers and partners come to see what we are doing to learn from our experiences,” Monckton said.

Over the next six months, the DALL team has a large number of analytics projects in the pipeline and is looking to help other areas of the business to better leverage their data. The organization will focus on how it can use Oracle Business Intelligence Enterprise Edition with business users. In addition, the NHSBSA is investigating how it might share data and its analytical ability with other government organizations to drive further value from its investment.


  • Use new insight gathered from data to help identify cost savings and meet NHSBSA strategic goals
  • Identify and prevent healthcare fraud and benefit eligibility errors to save costs
  • Leverage existing data to transform business and productivity


Oracle Product and Services

  • Identified up to GBP100 million (US$156 million) that could potentially be saved across the NHS through benefit fraud and error reduction, by deploying new analytics infrastructure
  • Identified and implemented changes to prevent fraudulent European Health Insurance Card (EHIC) applications
  • Used data matching to identify savings that can be made through the recovery of money from patients claiming exemption from charges for dental treatment or prescriptions when not eligible to do so
  • Used anomaly detection to uncover fraudulent activity where some dentists split a single course of treatment into multiple parts and presented claims for multiple treatments
  • Analyzed unstructured text to measure employee satisfaction in more detail and found a direct link between those who felt less engaged at work and those more likely to take time off sick
  • Analyzed billions of records at one time to measure longer-term patient journeys and to analyze drug prescribing patterns to improve patient care
  • Established a new Data Analytics Learning Laboratory (DALL) that uses data and analytics to drive action and significant savings for the NHS
  • Implemented Oracle Advanced Analytics, Oracle Exadata Database Machine, Oracle Exalytics In-Memory Machine, Oracle Endeca Information Discovery, and Oracle Business Intelligence Enterprise Edition to deliver fast analysis and data mining for NHS and wider government departments

Why Oracle

“We chose Oracle because the solution could cope with very large data volumes running into billions of rows and could scale as volumes increase. In addition, the Oracle solution required no IT team support to run the queries, which enables our team of data analysts to be self-sufficient. Oracle Exalytics’ in-memory capability gave us the speed we required, and Oracle’s engineered systems accelerated deployment and reduced risk.

“Working with Oracle has been a very positive experience. The team has been incredibly responsive and provided a number of experts to help us get up and running as quickly as possible. With one vendor providing the whole solution, it’s very easy for us. If we need help, we know where to go,” Monckton said.

Implementation Process

Oracle ran a proof of concept (POC) to show the speed and capability of the proposed end-to-end solution. The POC used publicly available data sets for NHS prescription data. It covered 50 million prescribed items, 300 million records, and six months of data. The team concentrated on finding anomalies in the data and carrying out further analysis to understand them before presenting the findings in a clear and straightforward way.

Following the POC, Oracle worked with NHSBSA and its data center partner, Capita, to complete the implementation. During implementation, Oracle provided the NHSBSA with access to a virtual environment. This enabled the team to get some experience with the tools before completing the implementation. As such, NHSBSA was familiar and confident with using the new analytics tools from day one, saving considerable time and gaining immediate value.

NHSBSA identified which data it should use for analysis and transferred it across to its Oracle Exadata environment. To date it has transferred more than 15 billion rows of data into Oracle Exadata. The prescription services database, with 14 billion rows of data, is the largest exported data source, using 400 gigabytes. The export took 10 hours to complete with Oracle as the source database.

Advice from NHSBSA

  • Have a clear plan for the first six months before you begin your implementation
  • Ensure you have buy-in from key stakeholders
  • Choose easy areas to start with, so you can demonstrate positive results quickly and prove the value of the solution to others
  • Build knowledge within your team through training and Oracle events; this helps staff to think differently about the possibilities of using data
  • Get help from the experts: talk to your existing suppliers, go to analytics events, and talk to other organizations who have implemented analytics
  • It’s never too early to think about data governance and data quality: recruit a data standards manager to create data governance policies and identify data leads around the business


Friday Sep 25, 2015

Oracle Advanced Analytics at Oracle Open World 2015

While there are a lot of OOW talks that include the word “analytics” or “big data”, this is my short list of sessions, training and demos that primarily focus on Oracle Advanced Analytics. Hope to see you there!


Oracle Advanced Analytics at OOW'15 Highlights

Big Data Analytics with Oracle Advanced Analytics 12c and Big Data SQL & Fiserv Case Study: Fraud Detection in Online Payments [CON8743]

Tuesday, Oct 27, 5:15 p.m. | Moscone South—307

· Charles Berger, Sr. Director of Product Management, Advanced Analytics and Data Mining, Oracle

· Miguel M Barrera, Director of Risk Analytics and Strategy

· Julia Minkowski, Risk Analytics Manager

Oracle Advanced Analytics 12c delivers parallelized in-database implementations of data mining algorithms and integration with R. Data analysts use Oracle Data Miner GUI and R to build and evaluate predictive models and leverage R packages and graphs. Application developers deploy Oracle Advanced Analytics models using SQL data mining functions and R. Oracle extends Oracle Database to an analytical platform that mines more data and data types, eliminates data movement, and preserves security to automatically detect patterns, anticipate customer behavior, and deliver actionable insights. Oracle Big Data SQL adds new big data sources and Oracle R Advanced Analytics for Hadoop provides algorithms that run on Hadoop. 

Fiserv manages risk for $30B+ in transfers, servicing 2,500+ US financial institutions, including 27 of the top 30 banks, and prevents $200M in fraud losses every year.  When dealing with potential fraud, reaction needs to be fast.  Fiserv describes its use of Oracle Advanced Analytics for fraud prevention in online payments and shares its best practices and results from turning predictive models into actionable intelligence and next generation strategies for risk mitigation.
Conference Session

OAA Demo Pod #3581: Big Data Predictive Analytics with Oracle Advanced Analytics, R, and Oracle Big Data SQL | Moscone South

The Oracle Advanced Analytics database option embeds powerful data mining algorithms in Oracle Database’s SQL kernel and adds integration with R for solving big data problems such as predicting customer behavior, anticipating churn, detecting fraud, and performing market basket analysis. Data analysts work directly with database data, using the Oracle Data Miner workflow GUI (SQL Developer 4.1 ext.), SQL, or R languages and can extend Oracle Advanced Analytics’ functionality with R graphics and CRAN packages. Oracle Big Data SQL enables Oracle Advanced Analytics models to run on Oracle Big Data Appliance. Oracle R Advanced Analytics for Hadoop provides a powerful R interface over Hadoop and Spark with parallel-distributed predictive algorithms. Learn more in this demo.

Real Business Value from Big Data and Advanced Analytics [UGF4519]

Sunday, Oct 25, 3:30 p.m. | Moscone South—301

· Antony Heljula, Technical Director, Peak Indicators Limited

· Brendan Tierney, Principal Consultant, Oralytics

Attend this session to hear real case studies where big data and advanced analytics have delivered significant return on investment to a variety of Oracle customers. These solutions can pay for themselves within one year. Customer case studies include predicting which employees are likely to leave within the next 12 months, predicting which sales outlets are likely to suffer from out-of-stock products, predicting sales based on the weather forecast, and predicting which students are likely to withdraw early from their courses. A live demonstration illustrates the high-level process for implementing predictive business intelligence (BI) and its best practices.  User Group Forum Session

Customer Panel: Big Data and Data Warehousing [CON8741]

Wednesday, Oct 28, 4:15 p.m. | Moscone South—301

· Craig Fryar, Head of Wargaming Business Intelligence, Wargaming.net

· Manuel Martin Marquez, Senior Research Fellow and Data Scientist, Cern Organisation Européenne Pour La Recherche Nucléaire

· Jake Ruttenburg, Senior Manager, Digital Analytics, Starbucks

· Chris Wones, Chief Enterprise Architect, 8451

· Reiner Zimmermann, Senior Director, DW & Big Data Global Leaders Program, Oracle

In this session, hear how customers around the world are solving cutting-edge analytical business problems using Oracle Data Warehouse and big data technology. Understand the benefits of using these technologies together, and how software and hardware combined can save money and increase productivity. Learn how these customers are using Oracle Big Data Appliance, Oracle Exadata, Oracle Exalytics, Oracle Database In-Memory 12c, or Oracle Analytics to drive their business, make the right decisions, and find hidden information. The conversation is wide-ranging, with customer panelists from a variety of industries discussing business benefits, technical architectures, implementation of best practices, and future directions.  Conference Session

End-to-End Analytics Across Big Data and Data Warehouse for Data Monetization [CON3296]

Monday, Oct 26, 4:00 p.m. | Moscone West—2022

· Satya Bhamidipati, Senior Principal Advanced Analytics Market Dev, Business Analytics Product Group, Oracle

· Gokula Mishra, VP, Big Data & Advanced Analytics, Oracle

Organizations have used data warehouses to manage structured and operational data, which provides business analysts with the ability to analyze key internal data and spot trends. However, the explosion of newer data sources (big data) not only challenges the role of the traditional data warehouse in analyzing data from these diverse sources but also exposes limitations posed by traditional software and hardware platforms. This newer data can be combined with the data in the data warehouse and analyzed without creating another data silo and creating a hybrid data analytics structure. This presentation discusses the data and analytics platform architecture that enables this data monetization and presents various industry use cases.  Conference Session

Building Predictive Models for Identifying and Preventing Tax Fraud [CON3294]

Wednesday, Oct 28, 9:00 a.m. | Park Central—Concordia

· Brian Bequette, Managing Partner, TPS

· Satya Bhamidipati, Senior Principal Advanced Analytics Market Dev, Business Analytics Product Group, Oracle

According to a TIGTA Audit Report issued in February 2013, in 2012 alone, the IRS identified almost 642,000 incidents of identity theft affecting tax administration, a 38 percent increase since 2010. And this number continues to increase. Tax Processing Systems (TPS) consultants have focused on fraud detection and developed innovative solutions and proprietary algorithms for detecting fraud. In 2012, TPS formed a partnership with Oracle and has adapted its cloud-based methodologies and algorithms for use on the Oracle technology stack. Together, TPS and Oracle have created an end-to-end fraud detection solution that is effective, efficient, and accurate. This presentation focuses on the technology and the algorithms they have developed to detect fraud.  Conference Session

Oracle University Pre-OOW Course – Sunday, Oct. 25th

Using Data Mining Techniques for Predictive Analysis Course, Sunday October 25th

This session teaches students the basic concepts of data mining and how to leverage the predictive analytical power of data mining with Oracle Database by using Oracle Data Miner 12c. Students will learn how to explore the data graphically, build and evaluate multiple data models, apply data mining models to new data, and deploy data mining's predictions and insights throughout the enterprise. All this can be performed on the data in Oracle Database on a real-time basis by using Oracle Data Miner SQL APIs. As the data, models, and results remain in Oracle Database, data movement is eliminated, security is maximized, and information latency is minimized.
See Oracle University at Oracle OpenWorld and Make the Most of Your Oracle OpenWorld and JavaOne Experience with Preconference Training by Oracle Experts

When: Sunday, October 25, 2015, 9 a.m.-4 p.m., with a one-hour lunch break
Where: Golden Gate University, 536 Mission Street, San Francisco, CA 94105 (three blocks from Moscone Center)
Cost: US$850 for a full day of training (cost includes light refreshments and a boxed lunch)

Instructor: Ashwin Agarwal

Target Audience: Data scientists, application developers, and data analysts

Course Objectives:

  • Understand the basic concepts and describe the primary terminology of data mining
  • Understand the steps associated with a data mining process
  • Use Oracle Data Miner 12c to perform data mining
  • Understand the options for deploying data mining predictive results

Course Topics:

  • Understanding the Data Mining Concepts
  • Understanding the Benefits of Predictive Analysis
  • Understanding Data Mining Tasks
  • Key Steps of a Data Mining Process (Includes Demo)
  • Using Oracle Data Miner to Build, Evaluate, and Apply Multiple Data Mining Models (Includes Demo)
  • Using Data Mining Predictions and Insights to Address Various Business Problems (Includes Demo)
  • Predicting Individual Behavior (Includes Demo)
  • Predicting Values (Includes Demo)
  • Finding Co-Occurring Events (Includes Demo)
  • Detecting Anomalies (Includes Demo)
  • Learning How to Deploy Data Mining Results for Real-Time Access by End Users

Prerequisites: A working knowledge of the SQL language and Oracle Database design and administration

Also, the OTN pages for Big Data + Analytics related products include a “Must See” Program Guide. Click the .pdf link http://www.oracle.com/technetwork/database/openworld2015pdf-2650488.pdf to see the full list.

Friday Aug 07, 2015

Oracle Advanced Analytics Oracle University (OU) Classes in Cambridge, MA. September 28-Oct. 1, 2015

Oracle University has rescheduled its two-day, back-to-back Oracle Advanced Analytics OU classes in Cambridge, MA.   Please help spread the word.

Oracle Advanced Analytics combo-course (ODM + ORE) training

This is a great opportunity for big data analytics customers and partners to get hands-on experience with Oracle Advanced Analytics.  Vlamis, whose staff are authorized OU instructors, will be teaching the OAA/ODM & OAA/ORE courses again and has been a great and knowledgeable OAA training and implementation partner. The courses also take place during the week of Predictive Analytics World in Boston (Oracle will be exhibiting and speaking), so this may be a good time for customers to come to Boston, perhaps use some OU credits, learn some new skills and focus on Oracle’s predictive analytics.

Anyone (customers and Oracle Employees) can register through us at http://www.vlamis.com/training/ or via their normal OU connections. They should be able to utilize OU training credits for either course.  Oracle Employees should register through the Employee Self Service from Self Service Applications

Please forward to any appropriate Oracle Advanced Analytics customers and partners.  Thanks!


Sunday Jul 26, 2015

Big Data Analytics with Oracle Advanced Analytics: Making Big Data and Analytics Simple white paper

Big Data Analytics with Oracle Advanced Analytics:

Making Big Data and Analytics Simple

Oracle White Paper  |  July 2014 

Executive Summary:  Big Data Analytics with Oracle Advanced Analytics

(Click HERE to read entire Oracle white paper)   (Click HERE to watch YouTube video)

The era of “big data” and the “cloud” is driving companies to change.  Just to keep pace, they must learn new skills and implement new practices that leverage those new data sources and technologies.  Customers increasingly expect improved interactions and greater perceived value in exchange for sharing their digital exhaust with corporations, and those expectations are pushing companies forward.  Big data and analytics offer the promise to satisfy these new requirements.  Cloud, competition, big data analytics and next-generation “predictive” applications are driving companies towards achieving new goals of delivering improved “actionable insights” and better outcomes.  Traditional BI & Analytics approaches don’t deliver these detailed predictive insights and simply can’t satisfy the emerging customer expectations in this new world order created by big data and the cloud.

Unfortunately, with big data, as the data grows and expands in the three V’s (velocity, volume and variety of data types), new problems emerge.  Data volumes grow and data becomes unmanageable and immovable.  Scalability, security, and information latency become new issues.  Unstructured data, sensor data and spatial data all introduce new data type complexities.

Traditional advanced analytics has several inherent information technology weak points: data extracts and data movement, data duplication resulting in no single source of truth, data security exposures, and separate, often multiple, analytical tools (commercial and open source) and languages (SAS, R, SQL, Python, SPSS, etc.), depending on the skills of the data analysts/scientists involved.  Problems become particularly egregious during the deployment phase, when the worlds of data analysis and information management collide.

Traditional data analysis typically starts with a representative sample or subset of the data that is exported to separate analytical servers and tools (SAS, R, Python, SPSS, etc.) that have been especially designed for statisticians and data scientists to analyze data.  The analytics they perform range from simple descriptive statistical analysis to advanced predictive and prescriptive analytics.  If a data scientist builds a predictive model that is determined to be useful and valuable, IT must then be involved to figure out deployment, and enterprise deployment and application integration issues become the next big challenge.  The predictive model(s), along with all associated data preparation and transformation steps, have to be somehow translated to SQL and recreated inside the database in order to apply the models and make predictions on the larger datasets maintained inside the data warehouse.  This model translation phase introduces tedious, time-consuming and expensive manual coding steps from the original statistical language (SAS, R, or Python) into SQL.  DBAs and IT must somehow “productionize” these separate statistical models inside the database and/or data warehouse for distribution throughout the enterprise.  Some vendors charge for specialized products and options just for predictive model deployment.  This is where many advanced analytics projects fail.  Add Hadoop, sensor data, tweets, and expanding big data reservoirs, and the entire “data to actionable insights” process becomes more challenging.

Not with Oracle.  Oracle delivers a big data and analytics platform that eliminates the traditional extract, move, load, analyze, export, move, load paradigm.  With Oracle Database 12c and the Oracle Advanced Analytics Option, big data management and big data analytics are designed into the data management platform from the beginning.  Oracle’s multiple decades of R&D investment in developing the industry’s leading data management platform, Oracle SQL, Big Data SQL, Oracle Exadata, Oracle Big Data Appliance and integration with open source R are seamlessly combined and integrated into a single platform: the Oracle Database.

Oracle’s vision is a big data and analytic platform for the era of big data and cloud to:

  • Make big data and analytics simple (for any data size, on any computer infrastructure and any variety of data, in any combination) and

  • Make big data and analytics deployment simple (as a service, as a platform, as an application)

Oracle Advanced Analytics offers a wide library of powerful in-database algorithms and integration with open source R that together can solve a wide variety of business problems and can be accessed via SQL, R or GUI.  Oracle Advanced Analytics, an option to the Oracle Database Enterprise Edition 12c, extends the database into an enterprise-wide analytical platform for data-driven problems such as churn prediction, customer segmentation, fraud and anomaly detection, identifying cross-sell and up-sell opportunities, market basket analysis, and text mining and sentiment analysis.  Oracle Advanced Analytics empowers data analysts, data scientists and business analysts to extract more knowledge, discover new insights and make informed predictions, working directly with large data volumes in the Oracle Database.

Data analysts/scientists have choice and flexibility in how they interact with Oracle Advanced Analytics.  Oracle Data Miner is an Oracle SQL Developer extension designed for data analysts that provides an easy to use “drag and drop” workflow GUI to the Oracle Advanced Analytics SQL data mining functions (Oracle Data Mining).  Oracle SQL Developer is a free integrated development environment that simplifies the development and management of Oracle Database in both traditional and Cloud deployments. When Oracle Data Miner users are satisfied with their analytical methodologies, they can share their workflows with other analysts and/or generate SQL scripts to hand to their DBAs to accelerate model deployment.  Oracle Data Miner also provides a PL/SQL API for workflow scheduling and automation.  

R programmers and data scientists can use the familiar open source R statistical programming language console, RStudio or any IDE to work directly with data inside the database and leverage Oracle Advanced Analytics’ R integration with the database (Oracle R Enterprise).  Oracle Advanced Analytics’ Oracle R Enterprise transparently translates R to equivalent SQL and Oracle Data Mining functions for in-database performance, parallelism, and scalability, making R ready for the enterprise.

Application developers, using the ODM SQL data mining functions and ORE R integration, can build completely automated predictive analytic solutions that leverage the strengths of the database and the flexibility of R to integrate Oracle Advanced Analytics analytical solutions into BI dashboards and enterprise applications.

By integrating big data management and big data analytics into the same powerful Oracle Database 12c data management platform, Oracle eliminates data movement, reduces total cost of ownership and provides the fastest way to deliver enterprise-wide predictive analytics solutions and applications.

(Click HERE to read entire Oracle white paper)

Wednesday Oct 08, 2014

2014 was a very good year for Oracle Advanced Analytics at Oracle Open World 2014

2014 was a very good year for Oracle Advanced Analytics at Oracle Open World 2014.   We had a number of customer, partner and Oracle talks that focused on the Oracle Advanced Analytics Database Option.    See below for links to presentations.  Check back later at the OOW Sessions Content Catalog, as not all presentations have been uploaded yet.  :-(

Big Data and Predictive Analytics: Fiserv Data Mining Case Study [CON8631]

Moving data mining algorithms to run as native data mining SQL functions eliminates data movement, automates knowledge discovery, and accelerates the transformation of large-scale data to actionable insights from days/weeks to minutes/hours. In this session, Fiserv, a leading global provider of electronic commerce systems for the financial services industry, shares best practices for turning in-database predictive models into actionable policies and illustrates the use of Oracle Data Miner for fraud prevention in online payments. Attendees will learn how businesses that implement predictive analytics in their production processes significantly improve profitability and maximize their ROI.

Developing Relevant Dining Visits with Oracle Advanced Analytics at Olive Garden [CON2898]

Olive Garden, traditionally managing its 830 restaurants nationally, transitioned to a localized approach with the help of predictive analytics. Using k-means clustering and logistic classification algorithms, it divided its stores into five behavioral segments. The analysis leveraged Oracle SQL Developer 4.0 and Oracle R Enterprise 1.3 to evaluate 115 million transactions in just 5 percent of the time required by the company’s BI tool. While saving both time and money by making it possible to develop the solution internally, this analysis has informed Olive Garden’s latest remodel campaign and continues to uncover millions in profits by optimizing pricing and menu assortment. This session illustrates how Oracle Advanced Analytics solutions directly affect the bottom line.

A Perfect Storm: Oracle Big Data Science for Enterprise R and SAS Users [CON8331]

With the advent of R and a rich ecosystem of users and developers, a myriad of bloggers, and thousands of packages with functionality ranging from social network analysis and spatial data analysis to empirical finance and phylogenetics, use of R is on a steep uptrend. With new R tools from Oracle, including Oracle R Enterprise, Oracle R Distribution, and Oracle R Advanced Analytics for Hadoop, users can scale and integrate R for their enterprise big data needs. Come to this session to learn about Oracle’s R technologies and what data scientists from smart companies around the world are doing with R.

Extending the Power of In-Database Analytics with Oracle Big Data Appliance [CON2452]

The need for speed could not be greater: not speed of processing, but time to market. The problem is driven by the long journey data takes before evolving into insight. Insight, however, is always relative to assumption. In fact, analytics is often seen as a battle between assumption and data. Assumptions can be classified into three types: related to distributions, ratios, and relations. In this session, you will see how the most valuable business insights can come in a matter of hours, not months, when assumptions are challenged with data. This is made possible by the integration of Oracle Big Data Appliance, enabling transparent access to in-database analytics from the data warehouse and avoiding the traditional long journey of data to insight.

Market Basket Analysis at Dunkin’ Brands [CON6545]

With almost 120 years of franchising experience, Dunkin’ Brands owns two of the world’s most recognized, beloved franchises: Dunkin’ Donuts and Baskin-Robbins. This session describes a market basket analysis solution built from scratch on the Oracle Advanced Analytics platform at Dunkin’ Brands. This solution enables Dunkin’ to look at product affinity and a host of associated sales metrics with a view to improving promotional effectiveness and cross-sell/up-sell to increase customer loyalty. The presentation discusses the business value achieved and technical challenges faced in scaling the solution to Dunkin’ Brands’ transaction volumes, including engineered systems (Oracle Exadata) hardware and parallel processing at the core of the implementation.

Predictive Analytics with Oracle Data Mining [CON8596]

This session presents three case studies related to predictive analytics with the Oracle Data Mining feature of Oracle Advanced Analytics. Service contracts cancellation avoidance with Oracle Data Mining is about predicting the contracts at risk of cancellation at least nine months in advance. Predicting hardware opportunities that have a high likelihood of being won means identifying such opportunities at least four months in advance to provide visibility into suppliers of required materials. Finally, predicting cloud customer churn involves identifying the customers that are not as likely to renew subscriptions as others.

SQL Is the Best Development Language for Big Data [CON7439]

SQL has a long and storied history. From the early 1980s till today, data processing has been dominated by this language. It has changed and evolved greatly over time, gaining features such as analytic windowing functions, model clauses, and row-pattern matching. This session explores what's new in SQL and Oracle Database for exploiting big data. You'll see how to use SQL to efficiently and effectively process data that is not stored directly in Oracle Database.

Advanced Predictive Analytics for Database Developers on Oracle [CON7977]

Traditional database applications use SQL queries to filter, aggregate, and summarize data. This is called descriptive analytics. The next level is predictive analytics, where hidden patterns are discovered to answer questions that give unique insights that cannot be derived with descriptive analytics. Businesses are increasingly using machine learning techniques to perform predictive analytics, which helps them better understand past data, predict future trends, and enable better decision-making. This session discusses how to use machine learning algorithms such as regression, classification, and clustering to solve a few selected business use cases.

What Are They Thinking? With Oracle Application Express and Oracle Data Miner [UGF2861]

Have you ever wanted to add some data science to your Oracle Application Express applications? This session shows you how you can combine predictive analytics from Oracle Data Miner into your Oracle Application Express application to monitor sentiment analysis. Using Oracle Data Miner features, you can build data mining models of your data and apply them to your new data. The presentation uses Twitter feeds from conference events to demonstrate how this data can be fed into your Oracle Application Express application and how you can monitor sentiment with the native SQL and PL/SQL functions of Oracle Data Miner. Oracle Application Express comes with several graphical techniques, and the presentation uses them to create a sentiment dashboard.

Transforming Customer Experience with Big Data and Predictive Analytics [CON8148]

Delivering a high-quality customer experience is essential for long-term profitability and customer retention in the communications industry. Although service providers own a wealth of customer data within their systems, the sheer volume and complexity of the data structures inhibit their ability to extract the full value of the information. To change this situation, service providers are increasingly turning to a new generation of business intelligence tools. This session begins by discussing the key market challenges for business analytics and continues by exploring Oracle’s approach to meeting these challenges, including the use of predictive analytics, big data, and social network analytics.

There are a few others where Oracle Advanced Analytics is included e.g. Retail GBU, Big Data Strategy, etc. but they are typically more broadly focused.  If you search the Content Catalog for “Advanced Analytics” etc. you can find other related presentations that involve OAA.

Hope this helps.  Enjoy!


Sunday May 18, 2014

Oracle Data Miner and Oracle R Enterprise Integration - Watch Demo

Oracle Advanced Analytics (Database EE) Option turns the database into an enterprise-wide analytical platform that can quickly deliver enterprise-wide predictive analytics and actionable insights.  Oracle Advanced Analytics comprises the Oracle Data Mining SQL data mining functions; Oracle Data Miner, an extension to SQL Developer that exposes the data mining SQL functions for data analysts; and Oracle R Enterprise, which integrates the R statistical programming language with SQL.  Together, 15 powerful in-database SQL data mining functions, the SQL Developer/Oracle Data Miner workflow GUI, and the ability to integrate open source R within an analytical methodology make the Oracle Database + Oracle Advanced Analytics Option the ideal platform for building and deploying enterprise-wide predictive analytics applications/solutions.

In Oracle Data Miner 4.0 we added a new SQL Query node that allows users to insert arbitrary SQL scripts within an ODMr analytical workflow.  Additionally, the SQL Query node allows users to leverage registered R scripts to extend Oracle Data Miner's analytical capabilities.  For applications that are based mostly on the OAA/Oracle Data Mining SQL data mining functions but require additional analytical techniques found in the R community, this is an ideal method for integrating the power of in-database SQL analytical and data mining functions with the flexibility of open source R.  For applications that are built entirely using the R statistical programming language, it may be more practical to stay within the R console or RStudio environments, but for SQL-centric in-database predictive methodologies, this integration may be just what you need.

Watch this Oracle Data Miner and Oracle R Enterprise Integration YouTube video to see the demo.

There is an excellent related Oracle Data Miner:  Integrate Oracle R Enterprise Algorithms into workflow using the SQL Query node (pdf, companion files) white paper on this topic that includes examples on the Oracle Technology Network in the Oracle Data Mining pages.  

Tuesday May 06, 2014

Oracle Data Miner 4.0/SQLDEV 4.0 New Features - Watch Demo!

Oracle Data Miner 4.0 New Features 

Oracle Data Miner/SQLDEV 4.0 (for Oracle Database 11g and 12c)

  • New Graph node (box, scatter, bar, histograms)
  • SQL Query node + integration of R scripts
  • Automatic SQL script generation for deployment

Oracle Advanced Analytics 12c new SQL data mining algorithms and enhancements exposed in Oracle Data Miner 4.0

  • Expectation Maximization Clustering algorithm
  • PCA & Singular Value Decomposition algorithms
  • Decision Trees can also now mine unstructured data
  • Improved/automated Text Mining, Prediction Details and other algorithm improvements
  • SQL Predictive Queries: automatic build and apply within a simple yet powerful SQL query

Monday Feb 03, 2014

How to generate Scatterplot Matrices using R script in Data Miner

Data Miner provides an Explorer node that produces descriptive statistics and a histogram for each input data column, which allows an analyst to examine the columns individually. Often, though, an analyst is interested in the relationships among the data columns, so that he can choose the columns most closely correlated with the target column for model building. To examine relationships among data columns, he can create scatter plots using the Graph node.

For example, an analyst may want to build a regression model that predicts the customer LTV (long term value) using the INSUR_CUST_LTV_SAMPLE demo data. Before building the model, he can create the following workflow with the Graph node to examine the relationships between the data columns of interest and the LTV target column.

In the Graph node editor, create a scatter plot with a data column of interest (X Axis) against the LTV target column (Y Axis). For the demo, let’s create three scatter plots using these data columns: HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT.

Here are the scatter plots generated by the Graph node. As you can see, HOUSE_OWNERSHIP and N_MORTGAGES are quite positively correlated with the LTV target column. However, MORTGAGE_AMOUNT seems less correlated with the LTV target column.

The problem with the above approach is that it is laborious to create scatter plots one by one, and you cannot examine the relationships among those data columns themselves. To solve the problem, we can create a scatterplot matrix graph like the following:

This is a 4 x 4 scatterplot matrix of the data columns LTV, HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT. In the top row, you can examine the relationships between HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT against the LTV target column. In the second row, you can examine the relationships between LTV, N_MORTGAGES, and MORTGAGE_AMOUNT against the HOUSE_OWNERSHIP column. In the third and fourth rows, you can examine the relationships of the other columns against N_MORTGAGES and MORTGAGE_AMOUNT, respectively.
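As a numeric companion to reading the matrix by eye, the same pairwise relationships can be summarized as a correlation matrix. The sketch below is a minimal illustration in plain Python rather than R or SQL. The `pearson` and `correlation_matrix` helpers and the five sample rows are assumptions made up for this example; they are not part of Oracle Data Miner, and the values are not taken from the INSUR_CUST_LTV_SAMPLE data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Correlate every column with every other column, like the cells of a scatterplot matrix."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical values for illustration only (NOT the real demo data).
sample = {
    "LTV": [18000.0, 22000.0, 25000.0, 31000.0, 36000.0],
    "HOUSE_OWNERSHIP": [0, 1, 1, 2, 2],
    "N_MORTGAGES": [0, 1, 1, 1, 2],
    "MORTGAGE_AMOUNT": [0.0, 1500.0, 900.0, 2500.0, 1200.0],
}

matrix = correlation_matrix(sample)

# Print one row per column, mirroring the row layout of the scatterplot matrix.
for name in sample:
    cells = " ".join(f"{matrix[name][other]:6.2f}" for other in sample)
    print(name.ljust(16), cells)
```

Each printed row corresponds to one row of the scatterplot matrix: the diagonal is always 1, and the matrix is symmetric, so the top row alone shows how strongly each candidate column tracks the LTV target. A correlation coefficient only captures linear association, so it complements the plots rather than replacing them.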

To generate this scatterplot matrix, we need to invoke the readily available R script RQG$pairs (via the SQL Query node) in Oracle R Enterprise. Please refer to http://www.oracle.com/technetwork/database/options/advanced-analytics/r-enterprise/index.html?ssSourceSiteId=ocomen for Oracle R Enterprise installation instructions.

Let’s create the following workflow with the SQL Query node to invoke the R script. Note: for a large data set, a Sample node may be needed to sample down the data size (e.g. to 1000 rows) before it is used for charting.

Enter the following SQL statement in the SQL Query editor. rqTableEval is an R SQL function that allows users to invoke an R script from the SQL side. The first SELECT statement within the function specifies the input data (LTV, HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT). The second SELECT statement specifies the optional parameter to the R script, where we define the graph title “Scatterplot Matrices”. The output of the function is an XML document with the graph data embedded in it.

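select value from table(rqTableEval(
cursor(select "INSUR_CUST_LTV_SAMPLE_N$10001"."LTV",
"INSUR_CUST_LTV_SAMPLE_N$10001"."HOUSE_OWNERSHIP",
"INSUR_CUST_LTV_SAMPLE_N$10001"."N_MORTGAGES",
"INSUR_CUST_LTV_SAMPLE_N$10001"."MORTGAGE_AMOUNT"
from "INSUR_CUST_LTV_SAMPLE_N$10001"), -- Input Cursor
cursor(select 'Scatterplot Matrices' as MAIN from DUAL), -- Param Cursor
'XML', -- Output Definition
'RQG$pairs' -- R Script
))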

You can see what default R scripts are available in the R Scripts tab. This tab is visible only when the Oracle R Enterprise installation is detected.

Click the button in the toolbar to invoke the R script to produce the Scatterplot matrix below.

You can copy the Scatterplot matrix image to the clipboard or save it to an image file (PNG) for reporting purposes. To do so, right-click on the graph to bring up the pop-up menu below.

The Scatterplot matrix is also available in the Data Viewer of the SQL Query node. To open the Data Viewer, select the “View Data” item in the pop-up menu of the node.

The returned XML data is displayed in the Data Viewer, as shown below. To view the Scatterplot matrix embedded in the data, click on the XML data to bring up the icon at the far right of the cell, and then click on the icon to bring up the viewer.

Tuesday Nov 12, 2013

Oracle Big Data Learning Library

Click on LEARN BY PRODUCT to view all learning resources.

Oracle Big Data Essentials

Attend this Oracle University Course!

Using Oracle NoSQL Database

Attend this Oracle University class!

Oracle and Big Data on OTN

See the latest resource on OTN.

Monday Jul 15, 2013

Oracle Data Miner GUI, part of SQL Developer 4.0 Early Adopter 1 is now available for download on OTN

The NEW Oracle Data Miner GUI, part of SQL Developer 4.0 Early Adopter 1, is now available for download on OTN.  See link to SQL Developer 4.0 EA1.

The Oracle Data Miner 4.0 New Features are applicable to Oracle Database 11g Release 2 and Oracle Database Release 12c:  See Oracle Data Miner Extension to SQL Developer 4.0 Release Notes for EA1 for additional information  

  • Workflow SQL Script Deployment
      ◦ Generates SQL scripts to support full deployment of workflow contents
  • SQL Query Node
      ◦ Integrate SQL queries to transform data or provide a new data source
      ◦ Supports the running of R Language Scripts and viewing of R generated data and graphics
  • Graph Node
      ◦ Generate Line, Scatter, Bar, Histogram and Box Plots
  • Model Build Node Improvements
      ◦ Node level data usage specification applied to underlying models
      ◦ Node level text specifications to govern text transformations
      ◦ Displays heuristic rules responsible for excluding predictor columns
      ◦ Ability to control the amount of Classification and Regression test results generated
  • View Data
      ◦ Ability to drill in to view custom objects and nested tables

These new Oracle Data Miner GUI capabilities expose Oracle Database 12c and Oracle Advanced Analytics/Data Mining Release 1 features:

· Predictive Query Nodes

o Predictive results without the need to build models using Analytical Queries

o Refined predictions based on data partitions

· Clustering Node New Algorithm

o Added Expectation Maximization algorithm

· Feature Extraction Node New Algorithms

o Added Singular Value Decomposition and Principal Component Analysis algorithms

· Text Mining Enhancements

o Text transformations integrated as part of Model's Automatic Data Preparation

o Ability to import Build Text node specifications into a Model Build node

· Prediction Result Explanations

o Scoring details that explain predictive results

· Generalized Linear Model New Algorithm Settings

o New algorithm settings provide feature selection and generation
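To make the Predictive Query item in the list above a little more concrete, here is a minimal conceptual sketch in Python. It is not Oracle's implementation, and inside the database this kind of query is written in SQL rather than Python. The sketch only mirrors the idea: fit a throwaway model per data partition, score the rows, and keep no model afterwards. The function names, the column names (`region`, `tenure`, `spend`) and the use of a simple least-squares line are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def fit_line(points):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:  # every x is identical: fall back to predicting the mean
        return 0.0, y_bar
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    return slope, y_bar - slope * x_bar


def predictive_query(rows, partition_key, x_key, y_key):
    """Score every row with a transient model fitted per partition.

    Rows whose target is None are scored but not used for fitting.
    The fitted models are local to this call and are never persisted.
    """
    groups = defaultdict(list)
    for row in rows:
        if row[y_key] is not None:
            groups[row[partition_key]].append((row[x_key], row[y_key]))

    # One throwaway model per partition (hypothetical stand-in for the
    # database's own algorithms).
    models = {part: fit_line(pts) for part, pts in groups.items()}

    scored = []
    for row in rows:
        model = models.get(row[partition_key])
        if model is None:  # partition has no labelled rows to learn from
            prediction = None
        else:
            slope, intercept = model
            prediction = slope * row[x_key] + intercept
        scored.append({**row, "prediction": prediction})
    return scored


# Hypothetical sample data: two partitions, one unlabelled row in each.
rows = [
    {"region": "EAST", "tenure": 1, "spend": 10.0},
    {"region": "EAST", "tenure": 2, "spend": 20.0},
    {"region": "EAST", "tenure": 3, "spend": None},
    {"region": "WEST", "tenure": 1, "spend": 5.0},
    {"region": "WEST", "tenure": 3, "spend": 9.0},
    {"region": "WEST", "tenure": 5, "spend": None},
]
scored = predictive_query(rows, "region", "tenure", "spend")
```

Because each region gets its own fit, the unlabelled EAST and WEST rows are scored from different slopes. That is the sense in which partitioning refines the predictions.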

See OAA on OTN pages http://www.oracle.com/technetwork/database/options/advanced-analytics/index.html for more information on Oracle Advanced Analytics.


Wednesday May 08, 2013

Oracle Advanced Analytics and Data Mining at the Movies on YouTube - Updated July 25, 2016

Updated July 25, 2016

Periodically, I've recorded a demonstration and/or presentation on Oracle Advanced Analytics and Data Mining and have posted them on YouTube.

Here are links to some of the more recent YouTube postings--sort of an Oracle Advanced Analytics and Data Mining at the Movies experience.

  1. Mining Structured and Unstructured Data using Oracle Advanced Analytics (slides)  - Watch on YouTube

  2. New - Big Data Analytics using Oracle Advanced Analytics 12c and Big Data SQL - Watch on YouTube
  3. New - Oracle Academy Webcast: Ask the Oracle Experts Fraud & Anomaly Detection using Oracle Advanced Analytics 12c & Big Data SQL - Watch on YouTube
  4. New - Oracle Academy Webcast: Ask the Oracle Experts Big Data Analytics with Oracle Advanced Analytics - Watch on YouTube
  5. Oracle Data Miner and Oracle R Enterprise Integration via SQL Query node - Watch Demo
  6. Oracle Data Miner 4.0 (SQL Developer 4.0 Extension) New Features - Watch Demo
  7. Oracle Business Intelligence Enterprise Edition (OBIEE) SampleAppls Demo featuring integration with Oracle Advanced Analytics/Data Mining
  8. Oracle Big Data Analytics Demo mining remote sensor data from HVACs for better customer service 
  9. In-Database Data Mining for Retail Market Basket Analysis Using Oracle Advanced Analytics
  10. In-Database Data Mining Using Oracle Advanced Analytics for Classification using Insurance Use Case
  11. Fraud and Anomaly Detection using Oracle Advanced Analytics Part 1 Concepts
  12. Fraud and Anomaly Detection using Oracle Advanced Analytics Part 2 Demo
  13. Overview Presentation and Demonstration of Oracle Advanced Analytics Database Option

So.... grab your popcorn and a comfortable chair.  Hope you enjoy!


Oracle Advanced Analytics at the Movies

Friday Jun 08, 2012

New Oracle Advanced Analytics presentation

I recently updated my presentation on Oracle's new Advanced Analytics Option which bundles Oracle Data Mining with Oracle R Enterprise for maximum depth and breadth of data mining, statistics and advanced analytic functions from Oracle.  See New Oracle Advanced Analytics presentation.  

Wednesday Apr 04, 2012

Recorded YouTube-like presentation and "live" demos of Oracle Advanced Analytics/Oracle Data Mining

Ever want to just sit and watch a YouTube-like presentation and "live" demos of Oracle Advanced Analytics/Oracle Data Mining?  Then click here! (plays large MP4 file in a browser)

This 1+ hour long session focuses primarily on the Oracle Data Mining component of the Oracle Advanced Analytics Option and is tied to the Oracle SQL Developer Days virtual and onsite events.   I cover:

  • Big Data + Big Data Analytics
  • Competing on analytics & value proposition
  • What is data mining?
  • Typical use cases
  • Oracle Data Mining high performance in-database SQL based data mining functions
  • Exadata "smart scan" scoring
  • Oracle Data Miner GUI (an Extension that ships with SQL Developer)
  • Oracle Business Intelligence EE + Oracle Data Mining results/predictions in dashboards
  • Applications "powered by Oracle Data Mining" for factory-installed predictive analytics methodologies
  • Oracle R Enterprise

Please contact charlie.berger@oracle.com should you have any questions.  Hope you enjoy! 

Charlie Berger, Sr. Director of Product Management, Oracle Data Mining & Advanced Analytics, Oracle Corporation

Wednesday Feb 08, 2012

Oracle Announces Availability of Oracle Advanced Analytics for Big Data


Oracle Integrates R Statistical Programming Language into Oracle Database 11g

REDWOOD SHORES, Calif. - February 8, 2012

News Facts

  • Oracle today announced the availability of Oracle Advanced Analytics, a new option for Oracle Database 11g that bundles Oracle R Enterprise together with Oracle Data Mining.
  • Oracle R Enterprise delivers enterprise class performance for users of the R statistical programming language, increasing the scale of data that can be analyzed by orders of magnitude using Oracle Database 11g.
  • R has attracted over two million users since its introduction in 1995, and Oracle R Enterprise dramatically advances capability for R users. Their existing R development skills, tools, and scripts can now also run transparently, and scale against data stored in Oracle Database 11g.
  • Customer testing of Oracle R Enterprise for Big Data analytics on Oracle Exadata has shown up to 100x increase in performance in comparison to their current environment.
  • Oracle Data Mining, now part of Oracle Advanced Analytics, helps enable customers to easily build and deploy predictive analytic applications that help deliver new insights into business performance. Oracle Advanced Analytics, in conjunction with Oracle Big Data Appliance, Oracle Exadata Database Machine and Oracle Exalytics In-Memory Machine, delivers the industry’s most integrated and comprehensive platform for Big Data analytics.

Comprehensive In-Database Platform for Advanced Analytics

  • Oracle Advanced Analytics brings analytic algorithms to data stored in Oracle Database 11g and Oracle Exadata as opposed to the traditional approach of extracting data to laptops or specialized servers.
  • With Oracle Advanced Analytics, customers have a comprehensive platform for real-time analytic applications that deliver insight into key business subjects such as churn prediction, product recommendations, and fraud alerting.
  • By providing direct and controlled access to data stored in Oracle Database 11g, customers can accelerate data analyst productivity while maintaining data security throughout the enterprise.
  • Powered by decades of Oracle Database innovation, Oracle R Enterprise helps enable analysts to run a variety of sophisticated numerical techniques on billion row data sets in a matter of seconds making iterative, speed of thought, and high-quality numerical analysis on Big Data practical.
  • Oracle R Enterprise drastically reduces the time to deploy models by eliminating the need to translate the models to other languages before they can be deployed in production.
  • Oracle R Enterprise integrates the extensive set of Oracle Database data mining algorithms, analytics, and access to Oracle OLAP cubes into the R language for transparent use by R users.
  • Oracle Data Mining provides an extensive set of in-database data mining algorithms that solve a wide range of business problems. These predictive models can be deployed in Oracle Database 11g and use Oracle Exadata Smart Scan to rapidly score huge volumes of data.
  • The tight integration between R, Oracle Database 11g, and Hadoop enables R users to write one R script that can run in three different environments: a laptop running open source R, Hadoop running with Oracle Big Data Connectors, and Oracle Database 11g.
  • Oracle provides single vendor support for the entire Big Data platform spanning the hardware stack, operating system, open source R, Oracle R Enterprise and Oracle Database 11g. To enable easy enterprise-wide Big Data analysis, results from Oracle Advanced Analytics can be viewed from Oracle Business Intelligence Foundation Suite and Oracle Exalytics In-Memory Machine.

Supporting Quotes

  • “Oracle is committed to meeting the challenges of Big Data analytics. By building upon the analytical depth of Oracle SQL, Oracle Data Mining and the R environment, Oracle is delivering a scalable and secure Big Data platform to help our customers solve the toughest analytics problems,” said Andrew Mendelsohn, senior vice president, Oracle Server Technologies.
  • “We work with leading edge customers who rely on us to deliver better BI from their Oracle Databases. The new Oracle R Enterprise functionality allows us to perform deep analytics on Big Data stored in Oracle Databases. By leveraging R and its library of open source contributed CRAN packages combined with the power and scalability of Oracle Database 11g, we can now do that,” said Mark Rittman, co-founder, Rittman Mead.

Supporting Resources

About Oracle

Oracle engineers hardware and software to work together in the cloud and in your data center. For more information about Oracle (NASDAQ: ORCL), visit http://www.oracle.com.


Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Contact Info

Eloy Ontiveros


Joan Levy
Blanc & Otus for Oracle



Everything about Oracle Data Mining, a component of the Oracle Advanced Analytics Option - News, Technical Information, Opinions, Tips & Tricks. All in One Place

