Monday Oct 12, 2015

NHS Business Services Authority Gains Better Insight into Data, Identifies circa GBP100 Million (US$156 Million) in Potential Savings in Just Three Months

The NHS Business Services Authority (NHSBSA) is a special health authority and an arm’s length body of the Department of Health for England. It provides a range of critical central services to NHS organizations, contractors, patients, and the public. Services include managing the NHS Pension schemes in England and Wales, managing payments to primary care dental and pharmacy contractors, and administering the European Health Insurance Card (EHIC).

The NHS budget for 2015/16 is approximately GBP116 billion (US$179 billion) and the total funds administered by the NHSBSA (including those for the NHS Pension schemes) amount to circa GBP32 billion (US$48 billion). The Department of Health asked the NHSBSA to take a proactive role to identify opportunities to reduce costs and eliminate waste. One way to do this was to find better ways to use the vast volumes of data already collected and held within the organization to help reduce fraud and error throughout the health service.

The NHSBSA needed a new, centralized solution that would enable it to gain better value from its data, which is spread across a disparate set of IT systems, storage, and analytical capabilities. To achieve this, it chose an end-to-end Oracle solution including Oracle Advanced Analytics, Oracle Exadata Database Machine, Oracle Exalytics In-Memory Machine, Oracle Endeca Information Discovery, and Oracle Business Intelligence Enterprise Edition.

With this Oracle solution, the NHSBSA established its Data Analytics Learning Laboratory (DALL), investing in both technology and expertise to create insight from its data. Within the first three months of operation, the organization identified circa GBP100 million (US$156 million) in potential savings.

Uncovering Savings in Dentistry

A word from NHS Business Services Authority

  • “Oracle Advanced Analytics’ data mining capabilities and Oracle Exalytics’ performance really impressed us. The overall solution is very fast, and our investment very quickly provided value. We can now do so much more with our data, resulting in significant savings for the NHS as a whole.” – Nina Monckton, Head of Information Services, NHS Business Services Authority

The NHSBSA used analytics to identify significant savings within NHS dental services and find instances of activities which do not demonstrate good value for money.

“With Oracle Advanced Analytics, it is much easier to detect anomalies in behaviors. We used anomaly detection to discover where there might be evidence of inappropriate behavior in dentists’ claims, enabling NHS commissioners to follow up and challenge their activities,” explained Nina Monckton, head of information services, NHSBSA.
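Oracle Advanced Analytics performs anomaly detection in-database (typically with a one-class SVM), and the case study does not publish its model. As a minimal language-neutral stand-in for the same idea, the sketch below flags contracts whose claiming behavior sits far outside the population norm using a simple z-score; the contract IDs and the claims-per-course metric are invented for illustration.

```python
from statistics import mean, stdev

def flag_anomalies(claims_per_contract, threshold=3.0):
    """Flag contracts whose average claims per course of treatment
    sit far outside the population norm (|z-score| > threshold).

    claims_per_contract: dict mapping a hypothetical dental contract id
    to its average number of claims submitted per course of treatment.
    """
    values = list(claims_per_contract.values())
    if len(values) < 2:
        return set()
    mu, sigma = mean(values), stdev(values)
    return {cid for cid, v in claims_per_contract.items()
            if sigma > 0 and abs(v - mu) / sigma > threshold}
```

Flagged contracts are leads for commissioners to follow up, not proof of wrongdoing; in production the in-database model would score individual claims rather than a single ratio.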

Preventing Fraud for European Health Insurance Card

The EHIC is available to all European citizens covered by a statutory social security scheme and entitles them to free healthcare while visiting other European countries. 

During analysis of EHIC data, the NHSBSA discovered commercial addresses being used fraudulently to apply for EHIC cards and uncovered the use of invalid NHS and National Insurance numbers to apply for a card. 

“We used Oracle Exalytics and Oracle Business Intelligence for the EHIC application to improve the front-end validation process, prevent fraud, and blacklist addresses showing suspicious activities,” Monckton said.
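The article does not spell out which validation rules were tightened. As one concrete example of the kind of front-end check involved, a 10-digit NHS number carries a modulus-11 check digit that can be verified before an application is accepted; this is a sketch of that published algorithm, not the NHSBSA's actual code.

```python
def valid_nhs_number(number: str) -> bool:
    """Validate a 10-digit NHS number via its modulus-11 check digit.

    Weights 10 down to 2 are applied to the first nine digits; the check
    digit is 11 minus the weighted sum modulo 11, where a result of 11
    maps to 0 and a result of 10 means no valid number exists.
    """
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) != 10:
        return False
    total = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    return check != 10 and check == digits[9]
```

A check like this rejects made-up identifiers at the point of entry, before any cross-checking against address blacklists.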

Analyzing Billions of Records in Minutes

The NHSBSA receives data relating to more than one billion prescription items dispensed in primary care settings each year. Previously, the NHSBSA did not have the computing power to analyze this data at transaction level.

The NHSBSA can now analyze billions of records at one time, and by analyzing much larger sets of patient data, the NHSBSA can provide insight that is helping to improve standards of care throughout the health service.

“Previously, our information analysts did not have the ability to directly query data as it was mainly held in live operational systems. Now that we are able to transfer data to our Exadata environment, we have dramatically improved our ability to deliver value from our data,” Monckton said.

Analyzing Unstructured Text to Measure Satisfaction

The NHSBSA also applied text analytics to unstructured survey responses to measure employee satisfaction in more detail, finding a direct link between employees who felt less engaged at work and those more likely to take time off sick.

Improving Data Matching To Save Millions of Dollars

In England, some people are entitled to free medical prescriptions or dental treatment from the NHS. The NHSBSA works with the Department for Work and Pensions (DWP) to establish that patients declaring themselves exempt from a charge for dental treatment and/or medical prescriptions are claiming correctly. Using Oracle Exalytics to compare datasets, the NHSBSA reduced the rate of non-matching records for dentistry from 15% to just 5%.
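The matching logic is not described, but non-match rates of this kind usually fall by normalising identifiers (case, spacing, punctuation) before comparing the two datasets. A minimal sketch, with invented field names:

```python
def normalise(record):
    """Build a matching key from surname, date of birth, and postcode,
    ignoring case, punctuation, and spacing differences."""
    surname, dob, postcode = record
    clean = lambda s: "".join(ch for ch in s.upper() if ch.isalnum())
    return (clean(surname), clean(dob), clean(postcode))

def non_match_rate(claims, entitlement_register):
    """Fraction of exemption claims with no counterpart in the register."""
    register_keys = {normalise(r) for r in entitlement_register}
    misses = sum(1 for c in claims if normalise(c) not in register_keys)
    return misses / len(claims)
```

Records that still fail to match after normalisation are the ones worth pursuing for recovery, since formatting noise has been ruled out.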

The Role of Data Governance

Data is now moving to the heart of all NHSBSA programs. As a result of the organization’s new analytics capability, teams have a better understanding of what they can do with the data and are more careful about what data they collect. 

“We now know that if we collect the right data at the start of a program, we can measure what is working down the line. We are starting to change the culture of the organization around our data governance. There has been a massive shift. Data is now central to all our new programs, and data governance is at the heart of everything we do,” Monckton said.

Using the Data Analytics Learning Laboratory to Achieve Strategic Goals

The NHSBSA’s data analytics investment is helping the organization to achieve its five-year strategic goals, which include helping to save GBP1 billion (US$1.56 billion) for NHS patients, reducing unit costs by 50%, improving service and delivering great results for customers, and deriving insight from data to drive change.

“With our newly established Data Lab in place, we can add even more value to the NHS. I cannot begin to describe how significant that has been. This project is really helping us to achieve our strategic goals. In addition, we are working in a different way now and it has even helped with how people interact and function in the workplace.

“We’ve had a very positive response, and our chief executive is extremely impressed with our achievements and the results we have shown so far. As a result, management is recommending that our suppliers and partners come to see what we are doing to learn from our experiences,” Monckton said.

Over the next six months, the DALL team has a large number of analytics projects in the pipeline and is looking to help other areas of the business to better leverage their data. The organization will focus on how it can use Oracle Business Intelligence Enterprise Edition with business users. In addition, the NHSBSA is investigating how it might share data and its analytical ability with other government organizations to drive further value from its investment.


  • Use new insight gathered from data to help identify cost savings and meet NHSBSA strategic goals
  • Identify and prevent healthcare fraud and benefit eligibility errors to save costs
  • Leverage existing data to transform business and productivity


Oracle Products and Services

  • Identified up to GBP100 million (US$156 million) that could potentially be saved across the NHS through benefit fraud and error reduction, by deploying new analytics infrastructure
  • Identified and implemented changes to prevent fraudulent European Health Insurance Card (EHIC) applications
  • Used data matching to identify savings that can be made through the recovery of money from patients claiming exemption from charges for dental treatment or prescriptions when not eligible to do so
  • Used anomaly detection to uncover fraudulent activity where some dentists split a single course of treatment into multiple parts and presented claims for multiple treatments
  • Analyzed unstructured text to measure employee satisfaction in more detail and found a direct link between those who felt less engaged at work and those more likely to take time off sick
  • Analyzed billions of records at one time to measure longer-term patient journeys and to analyze drug prescribing patterns to improve patient care
  • Established a new Data Analytics Learning Laboratory (DALL) that uses data and analytics to drive action and significant savings for the NHS
  • Implemented Oracle Advanced Analytics, Oracle Exadata Database Machine, Oracle Exalytics In-Memory Machine, Oracle Endeca Information Discovery, and Oracle Business Intelligence Enterprise Edition to deliver fast analysis and data mining for NHS and wider government departments

Why Oracle

“We chose Oracle because the solution could cope with very large data volumes running into billions of rows and could scale as volumes increase. In addition, the Oracle solution required no IT team support to run the queries, which enables our team of data analysts to be self-sufficient. Oracle Exalytics’ in-memory capability gave us the speed we required, and Oracle’s engineered systems accelerated deployment and reduced risk.

“Working with Oracle has been a very positive experience. The team has been incredibly responsive and provided a number of experts to help us get up and running as quickly as possible. With one vendor providing the whole solution, it’s very easy for us. If we need help, we know where to go,” Monckton said.

Implementation Process

Oracle ran a proof of concept (POC) to show the speed and capability of the proposed end-to-end solution. The POC used publicly available NHS prescription data sets covering 50 million prescribed items, 300 million records, and six months of data. The team concentrated on finding anomalies in the data and carrying out further analysis to understand them before presenting the findings in a clear and straightforward way.

Following the POC, Oracle worked with NHSBSA and its data center partner, Capita, to complete the implementation. During implementation, Oracle provided the NHSBSA with access to a virtual environment. This enabled the team to get some experience with the tools before completing the implementation. As such, NHSBSA was familiar and confident with using the new analytics tools from day one, saving considerable time and gaining immediate value.

NHSBSA identified which data it should use for analysis and transferred it across to its Oracle Exadata environment. To date it has transferred more than 15 billion rows of data into Oracle Exadata. The prescription services database, at 14 billion rows and roughly 400 gigabytes, is the largest data source exported so far; the export from the source Oracle database took 10 hours to complete.

Advice from NHSBSA

  • Have a clear plan for the first six months before you begin your implementation
  • Ensure you have buy-in from key stakeholders
  • Choose easy areas to start with, so you can demonstrate positive results quickly and prove the value of the solution to others
  • Build knowledge within your team through training and Oracle events; this helps staff to think differently about the possibilities of using data
  • Get help from the experts: talk to your existing suppliers, go to analytics events, and talk to other organizations who have implemented analytics
  • It’s never too early to think about data governance and data quality: recruit a data standards manager to create data governance policies and identify data leads around the business


Friday Sep 25, 2015

Oracle Advanced Analytics at Oracle Open World 2015

While there are a lot of OOW talks that include the words “analytics” or “big data”, this is my short list of sessions, training, and demos that primarily focus on Oracle Advanced Analytics. Hope to see you there!


Oracle Advanced Analytics at OOW'15 Highlights

Big Data Analytics with Oracle Advanced Analytics 12c and Big Data SQL &
Fiserv Case Study: Fraud Detection in Online Payments [CON8743]

Tuesday, Oct 27, 5:15 p.m. | Moscone South—307

· Charles Berger, Sr. Director of Product Management, Advanced Analytics and Data Mining, Oracle

· Miguel M Barrera, Director of Risk Analytics and Strategy, Fiserv

· Julia Minkowski, Risk Analytics Manager, Fiserv

Oracle Advanced Analytics 12c delivers parallelized in-database implementations of data mining algorithms and integration with R. Data analysts use Oracle Data Miner GUI and R to build and evaluate predictive models and leverage R packages and graphs. Application developers deploy Oracle Advanced Analytics models using SQL data mining functions and R. Oracle extends Oracle Database to an analytical platform that mines more data and data types, eliminates data movement, and preserves security to automatically detect patterns, anticipate customer behavior, and deliver actionable insights. Oracle Big Data SQL adds new big data sources and Oracle R Advanced Analytics for Hadoop provides algorithms that run on Hadoop. 

Fiserv manages risk for more than $30 billion in transfers, servicing 2,500+ US financial institutions, including 27 of the top 30 banks, and prevents $200 million in fraud losses every year. When dealing with potential fraud, reaction needs to be fast. Fiserv describes its use of Oracle Advanced Analytics for fraud prevention in online payments and shares best practices and results from turning predictive models into actionable intelligence and next-generation strategies for risk mitigation.
Conference Session

OAA Demo Pod #3581: Big Data Predictive Analytics with Oracle Advanced Analytics, R, and Oracle Big Data SQL | Moscone South

The Oracle Advanced Analytics database option embeds powerful data mining algorithms in Oracle Database’s SQL kernel and adds integration with R for solving big data problems such as predicting customer behavior, anticipating churn, detecting fraud, and performing market basket analysis. Data analysts work directly with database data, using the Oracle Data Miner workflow GUI (SQL Developer 4.1 ext.), SQL, or R languages and can extend Oracle Advanced Analytics’ functionality with R graphics and CRAN packages. Oracle Big Data SQL enables Oracle Advanced Analytics models to run on Oracle Big Data Appliance. Oracle R Advanced Analytics for Hadoop provides a powerful R interface over Hadoop and Spark with parallel-distributed predictive algorithms. Learn more in this demo.

Real Business Value from Big Data and Advanced Analytics [UGF4519]

Sunday, Oct 25, 3:30 p.m. | Moscone South—301

· Antony Heljula, Technical Director, Peak Indicators Limited

· Brendan Tierney, Principal Consultant, Oralytics

Attend this session to hear real case studies where big data and advanced analytics have delivered significant return on investment to a variety of Oracle customers. These solutions can pay for themselves within one year. Customer case studies include predicting which employees are likely to leave within the next 12 months, predicting which sales outlets are likely to suffer from out-of-stock products, predicting sales based on the weather forecast, and predicting which students are likely to withdraw early from their courses. A live demonstration illustrates the high-level process for implementing predictive business intelligence (BI) and its best practices.  User Group Forum Session

Customer Panel: Big Data and Data Warehousing [CON8741]

Wednesday, Oct 28, 4:15 p.m. | Moscone South—301

· Craig Fryar, Head of Wargaming Business Intelligence,

· Manuel Martin Marquez, Senior Research Fellow and Data Scientist, CERN (Organisation Européenne pour la Recherche Nucléaire)

· Jake Ruttenburg, Senior Manager, Digital Analytics, Starbucks

· Chris Wones, Chief Enterprise Architect, 8451

· Reiner Zimmermann, Senior Director, DW & Big Data Global Leaders Program, Oracle

In this session, hear how customers around the world are solving cutting-edge analytical business problems using Oracle Data Warehouse and big data technology. Understand the benefits of using these technologies together, and how software and hardware combined can save money and increase productivity. Learn how these customers are using Oracle Big Data Appliance, Oracle Exadata, Oracle Exalytics, Oracle Database In-Memory 12c, or Oracle Analytics to drive their business, make the right decisions, and find hidden information. The conversation is wide-ranging, with customer panelists from a variety of industries discussing business benefits, technical architectures, implementation of best practices, and future directions.  Conference Session

End-to-End Analytics Across Big Data and Data Warehouse for Data Monetization [CON3296]

Monday, Oct 26, 4:00 p.m. | Moscone West—2022

· Satya Bhamidipati, Senior Principal Advanced Analytics Market Dev, Business Analytics Product Group, Oracle

· Gokula Mishra, VP, Big Data & Advanced Analytics, Oracle

Organizations have used data warehouses to manage structured and operational data, which provides business analysts with the ability to analyze key internal data and spot trends. However, the explosion of newer data sources (big data) not only challenges the role of the traditional data warehouse in analyzing data from these diverse sources but also exposes limitations posed by traditional software and hardware platforms. This newer data can be combined with the data in the data warehouse and analyzed without creating another data silo and creating a hybrid data analytics structure. This presentation discusses the data and analytics platform architecture that enables this data monetization and presents various industry use cases.  Conference Session

Building Predictive Models for Identifying and Preventing Tax Fraud [CON3294]

Wednesday, Oct 28, 9:00 a.m. | Park Central—Concordia

· Brian Bequette, Managing Partner, TPS

· Satya Bhamidipati, Senior Principal Advanced Analytics Market Dev, Business Analytics Product Group, Oracle

According to a TIGTA Audit Report issued in February 2013, in 2012 alone, the IRS identified almost 642,000 incidents of identity theft affecting tax administration, a 38 percent increase since 2010. And this number continues to increase. Tax Processing Systems (TPS) consultants have focused on fraud detection and developed innovative solutions and proprietary algorithms for detecting fraud. In 2012, TPS formed a partnership with Oracle and has adapted its cloud-based methodologies and algorithms for use on the Oracle technology stack. Together, TPS and Oracle have created an end-to-end fraud detection solution that is effective, efficient, and accurate. This presentation focuses on the technology and the algorithms they have developed to detect fraud.  Conference Session

Oracle University Pre-OOW Course – Sunday, Oct. 25th

Using Data Mining Techniques for Predictive Analysis Course, Sunday October 25th

This session teaches students the basic concepts of data mining and how to leverage the predictive analytical power of data mining with Oracle Database by using Oracle Data Miner 12c. Students will learn how to explore the data graphically, build and evaluate multiple data mining models, apply data mining models to new data, and deploy data mining predictions and insights throughout the enterprise. All this can be performed on the data in Oracle Database on a real-time basis by using the Oracle Data Miner SQL APIs. As the data, models, and results remain in Oracle Database, data movement is eliminated, security is maximized, and information latency is minimized.
See Oracle University at Oracle OpenWorld, and make the most of your Oracle OpenWorld and JavaOne experience with preconference training by Oracle experts.

When: Sunday, October 25, 2015, 9 a.m.-4 p.m., with a one-hour lunch break
Where: Golden Gate University, 536 Mission Street, San Francisco, CA 94105 (three blocks from Moscone Center)
Cost: US$850 for a full day of training (cost includes light refreshments and a boxed lunch)

Instructor: Ashwin Agarwal

Target Audience: Data scientists, application developers, and data analysts

Course Objectives:

  • Understand the basic concepts and describe the primary terminology of data mining
  • Understand the steps associated with a data mining process
  • Use Oracle Data Miner 12c to perform data mining
  • Understand the options for deploying data mining predictive results

Course Topics:

  • Understanding the Data Mining Concepts
  • Understanding the Benefits of Predictive Analysis
  • Understanding Data Mining Tasks
  • Key Steps of a Data Mining Process (Includes Demo)
  • Using Oracle Data Miner to Build, Evaluate, and Apply Multiple Data Mining Models (Includes Demo)
  • Using Data Mining Predictions and Insights to Address Various Business Problems (Includes Demo)
  • Predicting Individual Behavior (Includes Demo)
  • Predicting Values (Includes Demo)
  • Finding Co-Occurring Events (Includes Demo)
  • Detecting Anomalies (Includes Demo)
  • Learning How to Deploy Data Mining Results for Real-Time Access by End Users

Prerequisites: A working knowledge of the SQL language and Oracle Database design and administration

Also, the Big Data + Analytics product pages on OTN include a “Must See” Program Guide; click the .pdf link to see the full list.

Friday Aug 07, 2015

Oracle Advanced Analytics Oracle University (OU) Classes in Cambridge, MA. September 28-Oct. 1, 2015

Oracle University has rescheduled its two-day, back-to-back Oracle Advanced Analytics classes in Cambridge, MA. Please help spread the word.

Oracle Advanced Analytics combo-course (ODM + ORE) training

This is a great opportunity for big data analytics customers and partners to get hands-on experience with Oracle Advanced Analytics. Vlamis, an authorized OU instructor, will again be teaching the OAA/ODM and OAA/ORE courses and has been a knowledgeable OAA training and implementation partner. The courses also run during the week of Predictive Analytics World in Boston (where Oracle will be exhibiting and speaking), so it may be a good time for customers to come to Boston, use some OU credits, learn new skills, and focus on Oracle’s predictive analytics.

Anyone (customers and Oracle employees) can register through us or via their normal OU connections, and should be able to use OU training credits for either course. Oracle employees should register through Employee Self Service under Self Service Applications.

Please forward to any appropriate Oracle Advanced Analytics customers and partners.  Thanks!


Friday Jul 24, 2015

2015 BIWA SIG Virtual Conference - Two Days of "Live" Talks by Experts - FREE

2015 BIWA SIG Virtual Conference

July 30-31, 2015 9:00 a.m. - 1:00 p.m. CDT

Join us for two full days where you will hear about the latest Business Intelligence trends. 

Day One:

  • 9:00 a.m. - 10:00 a.m.: What’s new in Oracle EPM and BI Infrastructure - Eric Helmer, ADI Strategies

Hyperion EPM and BI Fusion Edition is a dramatic change under the covers. Corporations must consider more global approaches to infrastructure to maintain availability and performance while reducing footprint and cost. Technologies such as Exalytics, Oracle virtualization, cloud computing, software as a service, and open source operating systems (Linux) are increasingly commonplace. Join Oracle ACE Director Eric Helmer as he covers what’s new, what’s supported, and what options you have when implementing your EPM/BI project.

  • 10:00 a.m. - 11:00 a.m.: Italian Ministry of Labor & Social Policy -- A Journey to Digital Government - Nicola Sandoli, ICONSULTING

The Italian Ministry of Labor and Social Policy (MLPS) is a branch of the Italian government responsible for all labor matters, including employment policies, promotions, worker protection, and social security. In its evolution toward digital government, MLPS is streamlining and simplifying its administrative processes. It has embarked on a data-driven journey to redefine business models and interactions with citizens, and to optimize and transform government services. MLPS is focusing on four areas:

  • Information delivery: transitioning its data warehouse platform from reporting to centralizing and certifying data
  • Business intelligence: monitoring activities, web publishing, and analyzing socio-political impact
  • Web analytics and semantic intelligence: interacting more efficiently with citizens
  • Job-hunting online guidance services: real-time answers for young people looking for jobs

MLPS is using a wide range of Oracle technologies to manage large amounts of diverse data and apply advanced analytics, including:

  • Oracle Exalytics for daily updates of 5 TB of data
  • Oracle Spatial and Graph and MapViewer 11g for location intelligence capabilities
  • Oracle Business Intelligence for desktop and mobile reporting
  • Oracle Endeca Information Discovery for web analytics, data discovery, and data analysis using social and semantic intelligence
  • Oracle Real-Time Decisions
  • Oracle Service-Oriented Architecture Suite as the central point for accessing and managing information made available through the Ministry’s web portal, Cliclavoro

Learn more about MLPS and its innovative platform that is delivering better information and services to its constituents.

  • 11:00 a.m. - 12:00 p.m.: Exadata: Elastic Configurations and IaaS – Private Cloud - Amit Kanda, Oracle

Customers face business challenges that include making real-time, data-driven decisions and reducing costs. Exadata’s extreme performance, combined with Database In-Memory, addresses real-time, data-driven decision making. Elastic configurations and an updated subscription model (IaaS – private cloud) for Exadata hardware and software accompanied the launch of Exadata X5-2. This presentation describes these updates and how customers can start small with Exadata and grow it with their business, making it easier to reach business objectives.

  • 12:00 p.m. - 1:00 p.m.: The State of the Internet of Things (IoT) - Shyam Varan Nath, GE

The Internet of Things (IoT) is poised to have a tremendous impact on the world around us. This session looks at the industry landscape of IoT. The different flavors of IoT are discussed with use cases from the consumer, commercial, and industrial sectors. Learn about the edge and cloud computing platforms that power IoT solutions. Finally, walk through use cases that show how machine and sensor data is being monetized through analytics, spanning aviation and other industries.

Day Two:

  • 9:00 a.m. - 10:00 a.m.: Big Data Analytics with Oracle Advanced Analytics 12c and Big Data SQL - Charlie Berger, Oracle

Oracle Advanced Analytics 12c delivers parallelized in-database implementations of data mining algorithms and integration with R. Data analysts use the Oracle Data Miner GUI and R to build and evaluate predictive models and leverage R packages and graphs. Application developers deploy OAA models using SQL data mining functions and R. Oracle extends the database to an analytical platform that mines more data and more data types, eliminates data movement, and preserves security to automatically detect patterns, anticipate customer behavior, and deliver actionable insights. Oracle Big Data SQL adds new big data sources, and Oracle R Advanced Analytics for Hadoop (ORAAH) provides algorithms that run on Hadoop. Come learn what’s new and best practices, and hear customer examples.

  • 10:00 a.m. - 11:00 a.m.: Graph Data Management and Analytics for Big Data - Bill Beauregard, Oracle & Zhe Wu, Oracle

The newest Oracle big data product, Oracle Big Data Spatial and Graph, offers a set of spatial analytic services, and a graph database with rich graph analytics that support big data workloads on Apache Hadoop and NoSQL technologies. Oracle is applying over a decade of expertise with spatial and graph analytic technologies to big data architectures. Graphs are an important data model for big data systems. Property graphs can be used for discovery, for instance, to discover underlying communities and influencers within a social graph, relationships and connections in cyber security networks, and to generate recommendations based on interests, profiles, and past behaviors. Oracle Big Data Spatial and Graph provides optimized storage, search and querying in Oracle NoSQL Database and Apache HBase for distributed property graphs. It offers 35 built-in, in-memory, parallel property graph analytic functions. We will discuss use cases, features, architecture, and show a demo. Learn how developers and data scientists can manage their most challenging graph data processing in a single enterprise-class Big Data platform.
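Oracle Big Data Spatial and Graph ships its analytics as in-memory, parallel functions. As a toy illustration of the influencer idea only (not the product's API), ranking the vertices of a social graph by degree is the simplest first pass:

```python
from collections import defaultdict

def degree_centrality(edges):
    """Rank vertices of an undirected property graph by degree,
    a first-pass proxy for 'influencers' in a social graph.

    edges: iterable of (vertex, vertex) pairs.
    Returns vertex ids sorted from most to least connected.
    """
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(degree, key=degree.get, reverse=True)
```

Real community detection and recommendation use richer measures (PageRank, connected components, and so on), but they start from exactly this kind of adjacency traversal.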

  • 11:00 a.m. - 12:00 p.m.: Why Oracle Database In-Memory? Use Cases and Overview - Andy Rivenes, Oracle

Oracle recently announced the availability of the Oracle Database In-Memory option, a memory-optimized database technology that transparently adds real-time analytics to applications. Because the In-Memory option is 100% compatible with existing Oracle Database applications, it’s easy to integrate it into your environment and to begin reaping the benefits. But how do you get started with it? What do you need to know to take full advantage of this new functionality? This session will give an overview of what Oracle Database In-Memory is and then discuss some use cases to highlight how it can be used.

| Register Here |

Wednesday Apr 22, 2015

OpenWorld 2015 Call for Proposals Extended to Wed, May 6th, 11:59 p.m.

Submit your Oracle Advanced Analytics stories now!

If you’re an Oracle technology expert, conference attendees want to hear it straight from you. So don’t wait: proposals must now be submitted by May 6.

Wanted: Outstanding Oracle Experts

The Oracle OpenWorld 2015 Call for Proposals is now open. Attendees at the conference are eager to hear from experts on Oracle business and technology. They’re looking for insights and improvements they can put to use in their own jobs: exciting innovations, strategies to modernize their business, different or easier ways to implement, unique use cases, lessons learned, the best of best practices.

If you’ve got something special to share with other Oracle users and technologists, they want to hear from you, and so do we. Submit your proposal now for this opportunity to present at Oracle OpenWorld, the most important Oracle technology and business conference of the year.

We recommend you take the time to review the General Information, Submission Information, Content Program Policies, and Tips and Guidelines pages before you begin. We look forward to your submissions.

Submit Your Proposal

By submitting a session for consideration, you authorize Oracle to promote, publish, display, and disseminate the content submitted to Oracle, including your name and likeness, for use associated with the Oracle OpenWorld and JavaOne San Francisco 2015 conferences. Press, analysts, bloggers and social media users may be in attendance at OpenWorld or JavaOne sessions.

General Information

  • Conference location: San Francisco, California, USA
  • Dates: Sunday, October 25 to Thursday, October 29, 2015
  • Website: Oracle OpenWorld

Key Dates for 2015

  • Call for Proposals opens: Wednesday, March 25
  • Call for Proposals closes (extended): Wednesday, May 6, 11:59 p.m. PDT
  • Notifications for accepted and declined submissions sent: mid-June

Contact us

  • For questions regarding the Call for Proposals, send an e-mail to
  • For technical questions about the submission tool or issues with submitting your proposal, send an e-mail to
  • Oracle employee submitters should contact the appropriate Oracle track leads before submitting. To view a list of track leads, click here.

Monday Dec 15, 2014

Use Oracle Data Miner to Perform Sentiment Analysis inside the Database Using Twitter Data (Demo)

Sentiment analysis has been a hot topic recently; sentiment analysis, or opinion mining, refers to the application of natural language processing, computational linguistics, and text analytics to identify and extract subjective information from source materials.  Social media websites are a good source of people's sentiments.  Companies have been using social networking sites to make new product announcements, promote their products, collect product reviews and user feedback, interact with their customers, and so on.  It is important for companies to sense customer sentiment toward their products, so they can react accordingly and benefit from customers' opinions.

In this blog, we will show you how to use Data Miner to perform some basic sentiment analysis (based on text analytics) using Twitter data.  The demo data was downloaded from the developer API console page of the Twitter website.  The data itself originated from the Oracle Twitter page, and it contains about a thousand tweets posted in the past six months (May to Oct 2014).  We will determine the sentiment (highly favored, moderately favored, or less favored) of each tweet based on its favorite count, and assign that sentiment to the tweet.  We then build classification models using these tweets along with their assigned sentiments.  The goal is to predict how well a new tweet will be received by customers, which may help the marketing department better craft a tweet before it is posted.

The demo (click here to download the demo Twitter data and workflow) uses the newly added JSON Query node in Data Miner 4.1 to import the Twitter data; please review the “How to import JSON data to Data Miner for Mining” entry posted previously on this blog.

Workflow for Sentiment Analysis

The following workflow shows the process we use to prepare the twitter data, determine the sentiments of tweets, and build classification models on the data.

The following describes the nodes used in the above workflow:

  • Data Source (TWITTER_LARGE)
    • Select the demo Twitter data source.  The sample Twitter data is attached with this blog.
  • JSON Query (JSON Query)
    • Select the required JSON attributes used for analysis; we only use the “id”, “text”, and “favorite_count” attributes.  The “text” attribute contains the tweet, and the “favorite_count” attribute indicates how many times the tweet has been favorited.
  • SQL Query (Cleanse Tweets)
    • Remove shortened URLs and punctuation from tweets because they contain no predictive information.
  • Filter Rows (Filter Rows)
    • Remove retweeted tweets because these are duplicate tweets.
  • Transform (Transform)
    • Bin the “favorite_count” data into three quantiles; each quantile represents a sentiment.  The top quantile represents the “highly favored” sentiment, the middle quantile the “moderately favored” sentiment, and the bottom quantile the “less favored” sentiment.
  • SQL Query (Recode Sentiment)
    • Assign quantiles as determined sentiments to tweets.
  • Create Table (OUTPUT_4_29)
    • Persist the data to a table for classification model build (optional).
  • Classification (Class Build)
    • Build classification models to predict customer sentiment toward a new tweet (how much will customers like this new tweet?).

Data Source Node (TWITTER_LARGE)

Select the JSON_DATA in the TWITTER_LARGE table.  The JSON_DATA contains about a thousand tweets to be used for sentiment analysis.

JSON Query Node (JSON Query)

Use the new JSON Query node to select the following JSON attributes.  This node projects the JSON data to relational data format, so that it can be consumed within the workflow process.

SQL Query Node (Cleanse Tweets)

Use the REGEXP_REPLACE function to remove numbers, punctuation, and shortened URLs inside tweets because these are considered noise and do not provide any predictive information.  Notice we do not treat hashtags inside tweets specially; these tags are treated as regular words.

We specify the number, punctuation, and URL patterns in regular expression syntax and use the database function REGEXP_REPLACE to replace these patterns in all tweets with empty strings.

SELECT REGEXP_REPLACE("JSON Query_N$10055"."TWEET", '([[:digit:]*]|[[:punct:]*]|(http[s]?://(.*?)(\s|$)))', '', 1, 0) "TWEETS",
       "JSON Query_N$10055"."FAVORITE_COUNT",
       "JSON Query_N$10055"."ID"
FROM "JSON Query_N$10055"
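For readers outside the database, the same cleansing can be sketched in Python with the standard re module (an illustrative equivalent of the REGEXP_REPLACE call above, not the node's generated code):

```python
import re

# Pattern mirroring the SQL above: digits, punctuation, and (shortened) URLs
NOISE = re.compile(r'(\d|[^\w\s]|https?://\S*)')

def cleanse_tweet(text):
    """Strip digits, punctuation, and URLs; collapse leftover whitespace."""
    cleaned = NOISE.sub('', text)
    return ' '.join(cleaned.split())

print(cleanse_tweet('Try #OracleDataMiner 4.1 now! http://ora.cl/xyz'))
# → Try OracleDataMiner now
```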

Filter Rows Node (Filter Rows)

Remove retweeted tweets because these are duplicate tweets.  Usually, retweets start with the abbreviation “RT”, so we specify the following row filter condition to filter out those tweets.
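To make the filtering concrete, here is a minimal Python sketch of the same idea (illustrative only; in the workflow this is expressed as a row filter condition on the tweet column, not code):

```python
tweets = [
    'RT @oracle: Data Miner 4.1 is out',
    'Trying the new JSON Query node today',
]

# Keep only original tweets; retweets conventionally start with "RT "
originals = [t for t in tweets if not t.startswith('RT ')]
print(originals)
# → ['Trying the new JSON Query node today']
```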

Transform Node (Transform)

Use the Transform node to bin the “favorite_count” data into three quantiles; each quantile represents a sentiment.  For simplicity, we just bin the count into three quantiles without applying any special treatment first.
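Conceptually, the binning works like this Python sketch (an illustrative equivalent using the standard statistics module, with made-up counts; it is not the Transform node's implementation):

```python
import statistics

# Hypothetical favorite counts for a batch of tweets
favorite_counts = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]

# Tertile cut points (equal-frequency bins, like the node's quantile bin)
q1, q2 = statistics.quantiles(favorite_counts, n=3)

def sentiment(count):
    if count <= q1:
        return 'less favored'
    elif count <= q2:
        return 'moderately favored'
    return 'highly favored'

print(sentiment(0), sentiment(4), sentiment(100))
```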

SQL Query Node (Recode Sentiment)

Assign the quantiles as determined sentiments to tweets; the top quantile represents the “highly favored” sentiment, the middle quantile the “moderately favored” sentiment, and the bottom quantile the “less favored” sentiment.  These sentiments become the target classes for the classification model build.

Classification Node (Class Build)

Build classification models using the sentiment as the target and the tweet id as the case id.

Since the TWEETS column contains the textual tweets, we change its mining type to Text Custom.

Enable the Stemming option for text processing.

Compare Test Results

After the model build completes successfully, open the test viewer to compare model test results.  The SVM model produces the best prediction for the “highly favored” sentiment (57% correct predictions).

Moreover, the SVM model has a better lift result than the other models, so we will use this model for scoring.

Sentiment Prediction (Scoring)

Let’s score this tweet “this is a boring tweet!” using the SVM model.

As expected, this tweet receives a “less favored” prediction.

How about this tweet “larry is doing a data mining demo now!” ?

Not surprisingly, this tweet receives a “highly favored” prediction.

Last but not least, let’s see the sentiment prediction for the title of this blog.

Not bad: it gets a “highly favored” prediction, so it seems this title will be well received by the audience.


The best SVM model achieves only 57% accuracy for the “highly favored” sentiment prediction, but that is considerably better than a random guess (about 33% for three classes).  With a larger sample of tweet data, the model accuracy could be improved.  The new JSON Query node enables us to perform data mining on JSON data, which is the most popular data format produced by prominent social networking sites.

Monday Dec 08, 2014

How to import JSON data to Data Miner for Mining

JSON is a popular lightweight data format widely used in big data applications. Increasingly, much of the data produced by big data systems is in JSON format. For example, web logs generated in middle-tier web servers are likely in JSON format, and NoSQL database vendors have chosen JSON as their primary data representation. Moreover, the JSON format is widely used in RESTful web service responses generated by the most popular social media websites like Facebook, Twitter, LinkedIn, etc. This JSON data could contain a wealth of information that is valuable for business use, so it is important that we can bring this data over to Data Miner for analysis and mining purposes.

Oracle Database provides the ability to store and query JSON data. To take advantage of the database JSON support, the upcoming Data Miner 4.1 adds a new JSON Query node that allows users to query JSON data in relational format. In addition, the current Data Source node and Create Table node are enhanced to allow users to specify JSON data in the input data source.

In this blog, I will show you how to specify JSON data in the input data source and use the JSON Query node to selectively query desirable attributes and project the result into relational format. Once the data is in relational format, users can treat it as a normal relational data source and start analyzing and mining it immediately. The Data Miner repository installation installs a sample JSON dataset, ODMR_SALES_JSON_DATA, which I will use here. Separately, Oracle Big Data SQL supports queries against vast amounts of big data stored in multiple data sources, including Hadoop; users can view and analyze data from various data stores together, as if it were all stored in an Oracle database.

Specify JSON Data

The Data Source and Create Table nodes are enhanced to allow users to specify the JSON data type in the input data source.

Data Source Node

For this demo, we will focus on the Data Source node. To specify JSON data, create a new workflow with a Data Source node. In the Define Data Source wizard, select the ODMR_SALES_JSON_DATA table. Notice there is only one column (JSON_DATA) in this table, which contains the JSON data.

Click Next to go to the next step, where it shows the JSON_DATA column selected with the JSON(CLOB) data type. The JSON prefix indicates the data stored is in JSON format; CLOB is the original data type. The JSON_DATA column is defined with the new “IS JSON” constraint, which ensures only valid JSON documents can be stored there. The UI detects this constraint and automatically selects the column as JSON type. If no “IS JSON” constraint were defined, the column would be shown with a CLOB data type. To manually designate a column as a JSON type, click on the data type itself to bring up an in-place dropdown that lists the original data type (e.g. CLOB) and a corresponding JSON type (e.g. JSON(CLOB)), then select the JSON type. Note: only the following data types can be set to JSON type: VARCHAR2, CLOB, BLOB, RAW, NCLOB, and NVARCHAR2.

Click Finish and run the node now.

Once the node is run successfully, open the editor to examine the generated JSON schema.

Notice the message “System Generated Data Guide is available” at the bottom of the Selected Attributes listbox. When the Data Source node is run, it parses the JSON documents to produce a schema that represents the document structure.

The JSON Path expression syntax and associated data type info (OBJECT, ARRAY, NUMBER, STRING, BOOLEAN, NULL) are used to represent JSON document structure. We will refer to this JSON schema as Data Guide throughout the product.
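As a conceptual illustration of how a path-and-type summary like this can be derived from a JSON document (a simplified sketch, not Oracle's Data Guide implementation; the sample attribute names are assumed from the ODMR_SALES_JSON_DATA examples later in this post):

```python
import json

def data_guide(value, path='$'):
    """Recursively map JSON Path expressions to JSON type names."""
    if isinstance(value, dict):
        yield path, 'OBJECT'
        for key, child in value.items():
            yield from data_guide(child, f'{path}.{key}')
    elif isinstance(value, list):
        yield path, 'ARRAY'
        for item in value:
            yield from data_guide(item, f'{path}[*]')
    elif isinstance(value, bool):          # check bool before int
        yield path, 'BOOLEAN'
    elif isinstance(value, (int, float)):
        yield path, 'NUMBER'
    elif value is None:
        yield path, 'NULL'
    else:
        yield path, 'STRING'

doc = json.loads('{"CUST_ID": 101, "SALES": '
                 '[{"PROD_ID": 5, "QUANTITY_SOLD": 2, "AMOUNT_SOLD": 39.98}]}')
print(dict(data_guide(doc)))
```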

Before we look at the Data Guide in the UI, let’s look at the settings that can affect how it is generated. Click the “JSON Settings…” button to open the JSON Parsing Settings dialog.

The settings are described below:

  • Generate Data Guide if necessary
    • Generate a Data Guide if it is not already generated in the parent node.
  • Sampling
    • Sample JSON documents for Data Guide generation.
  • Max. number of documents
    • Specify the maximum number of JSON documents to be parsed for Data Guide generation.
  • Limit Document Values to Process
    • Sample JSON document values for Data Guide generation.
  • Max. number per document
    • Specify the maximum number of JSON document scalar values (e.g. NUMBER, STRING, BOOLEAN, NULL) per document to be parsed for Data Guide generation.

The sampling option is enabled by default to prevent long-running parsing of JSON documents; parsing could take a while for a large number of documents. However, users may supply a Data Guide (Import From File) or reuse an existing Data Guide (Import From Workflow) if a compatible Data Guide is available.

Now let’s look at the Data Guide. Go back to the Edit Data Source Node dialog, select the JSON_DATA column, and click the icon above to open the Edit Data Guide dialog. The dialog shows the JSON structure in a hierarchical tree view with data type information. The “Number of Values Processed” shows the total number of JSON scalar values that were parsed to produce the Data Guide.

Users can control whether to enable Data Guide generation or import a compatible Data Guide via the menu under the icon.

The menu options are described below:

  • Default
    • Use the “Generate Data Guide if necessary” setting found in the JSON Parsing Settings dialog (see above).
  • On
    • Always generate a Data Guide.
  • Off
    • Do not generate a Data Guide.
  • Import From Workflow
    • Import a compatible Data Guide from a workflow node (e.g. Data Source, Create Table). The option will be set to Off after the import (disabling Data Guide generation).
  • Import From File
    • Import a compatible Data Guide from a file. The option will be set to Off after the import (disabling Data Guide generation).

Users can also export the current Data Guide to a file via the icon.

Select JSON Data

In Data Miner 4.1, a new JSON Query node is added to allow users to selectively bring over desirable JSON attributes as relational format.

JSON Query Node

The JSON Query node is added to the Transforms group of the Workflow.

Let’s create a JSON Query node and connect the Data Source node to it.

Double click the JSON Query node to open the editor. The editor consists of four tabs, described as follows:


  • The Column dropdown lists all available columns in the data source where a JSON structure (Data Guide) is found. This tab consists of two sub tabs:
    • Structure
      • Shows the JSON structure of the selected column in a hierarchical tree view.
    • Data
      • Shows a sample of the JSON documents found in the selected column. By default it displays the first 2,000 characters (including spaces) of the documents. Users can change the sample size (max. 50,000 chars) and run the query to see more of the documents.
  • Addition Output
    • Allows users to select any non-JSON columns in the data source as additional output columns.
  • Aggregation
    • Allows users to define aggregations of JSON attributes.
  • Preview
    • Output Columns
      • Shows the columns in the generated relational output.
    • Output Data
      • Shows the data in the generated relational output.


Let’s select some JSON attributes to bring over. Skip the SALES attributes because we want to define aggregations for these attributes (QUANTITY_SOLD and AMOUNT_SOLD).

To peek at the JSON documents, go to the Data tab. You can change the Sample Size to look at more JSON data. Also, you can search for specific data within the displayed documents by using the search control.

Addition Output Tab

If you have any non-JSON columns in the data source that you want to carry over for output, you can select those columns here.

Aggregate Tab

Let’s define aggregations (using the SUM function) for the QUANTITY_SOLD and AMOUNT_SOLD attributes (within the SALES array) for each customer group (group by CUST_ID).

Click the icon in the top toolbar to open the Edit Group By dialog, where you can select CUST_ID as the Group-By attribute. Notice the Group-By attribute can consist of multiple attributes.

Click OK to return to the Aggregate tab, where you can see the selected CUST_ID Group-By attribute is now added to the Group By Attributes table at the top.

Click the icon in the bottom toolbar to open the Add Aggregations dialog, where you can define the aggregations for both QUANTITY_SOLD and AMOUNT_SOLD attributes using the SUM function.

Next, click the icon in the toolbar to open the Edit Sub Group By dialog, where you can specify a Sub-Group By attribute (PROD_ID) to calculate quantity sold and amount sold per product per customer.

Specifying a Sub-Group By column creates a nested table; the nested table contains columns with data type DM_NESTED_NUMERICALS.

Click OK to return to the Aggregate tab, where you can see the defined aggregations are now added to the Aggregation table at the bottom.
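The aggregation defined above amounts to a group-by SUM over the nested SALES arrays. A Python sketch of the idea (illustrative only; the document shape and values are hypothetical, modeled on the sample dataset):

```python
from collections import defaultdict

# Hypothetical documents shaped like the ODMR_SALES_JSON_DATA sample
docs = [
    {"CUST_ID": 101, "SALES": [{"PROD_ID": 5, "QUANTITY_SOLD": 2, "AMOUNT_SOLD": 40.0},
                               {"PROD_ID": 7, "QUANTITY_SOLD": 1, "AMOUNT_SOLD": 15.0}]},
    {"CUST_ID": 101, "SALES": [{"PROD_ID": 5, "QUANTITY_SOLD": 3, "AMOUNT_SOLD": 60.0}]},
    {"CUST_ID": 102, "SALES": [{"PROD_ID": 7, "QUANTITY_SOLD": 4, "AMOUNT_SOLD": 60.0}]},
]

# Group by CUST_ID, sub-group by PROD_ID: SUM of quantity and amount sold
totals = defaultdict(lambda: [0, 0.0])
for doc in docs:
    for sale in doc["SALES"]:
        key = (doc["CUST_ID"], sale["PROD_ID"])
        totals[key][0] += sale["QUANTITY_SOLD"]
        totals[key][1] += sale["AMOUNT_SOLD"]

for (cust, prod), (qty, amt) in sorted(totals.items()):
    print(cust, prod, qty, amt)
```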

Preview Tab

Let’s go to the Preview tab to look at the generated relational output. The Output Columns tab shows all output columns and their corresponding source JSON attributes. The output columns can be renamed by using the in-place edit control.

The Output Data tab shows the actual data in the generated relational output.

Click OK to close the editor when you are done. The generated relational output is in single-record case format; each row represents a case. If we had not defined aggregations for the JSON array attributes, the relational output would have been in multiple-record case format, which is not suitable for building mining models except for the Association model (which accepts transactional data format with a transaction id and item id).

Use Case

Here is an example of how the JSON Query node is used to project the JSON data source to relational format, so that the data can be consumed by the Explore Data node for data analysis and the Class Build node for building models.


This blog shows how JSON data can be brought over to Data Miner via the new JSON Query node. Once the data is projected to relational format, it can easily be consumed by Data Miner for graphing, data analysis, text processing, transformation, and modeling.

Wednesday Oct 08, 2014

2014 was a very good year for Oracle Advanced Analytics at Oracle Open World 2014

2014 was a very good year for Oracle Advanced Analytics at Oracle Open World 2014.   We had a number of customer, partner, and Oracle talks that focused on the Oracle Advanced Analytics Database Option.  See below for links to the presentations.  Check back later on the OOW Sessions Content Catalog, as not all presentations have been uploaded yet.  :-(

Big Data and Predictive Analytics: Fiserv Data Mining Case Study [CON8631]

Moving data mining algorithms to run as native data mining SQL functions eliminates data movement, automates knowledge discovery, and accelerates the transformation of large-scale data to actionable insights from days/weeks to minutes/hours. In this session, Fiserv, a leading global provider of electronic commerce systems for the financial services industry, shares best practices for turning in-database predictive models into actionable policies and illustrates the use of Oracle Data Miner for fraud prevention in online payments. Attendees will learn how businesses that implement predictive analytics in their production processes significantly improve profitability and maximize their ROI.

Developing Relevant Dining Visits with Oracle Advanced Analytics at Olive Garden [CON2898]

Olive Garden, traditionally managing its 830 restaurants nationally, transitioned to a localized approach with the help of predictive analytics. Using k-means clustering and logistic classification algorithms, it divided its stores into five behavioral segments. The analysis leveraged Oracle SQL Developer 4.0 and Oracle R Enterprise 1.3 to evaluate 115 million transactions in just 5 percent the time required by the company’s BI tool. While saving both time and money by making it possible to develop the solution internally, this analysis has informed Olive Garden’s latest remodel campaign and continues to uncover millions in profits by optimizing pricing and menu assortment. This session illustrates how Oracle Advanced Analytics solutions directly affect the bottom line.

A Perfect Storm: Oracle Big Data Science for Enterprise R and SAS Users [CON8331]

With the advent of R and a rich ecosystem of users and developers, a myriad of bloggers, and thousands of packages with functionality ranging from social network analysis and spatial data analysis to empirical finance and phylogenetics, use of R is on a steep uptrend. With new R tools from Oracle, including Oracle R Enterprise, Oracle R Distribution, and Oracle R Advanced Analytics for Hadoop, users can scale and integrate R for their enterprise big data needs. Come to this session to learn about Oracle’s R technologies and what data scientists from smart companies around the world are doing with R.

Extending the Power of In-Database Analytics with Oracle Big Data Appliance [CON2452]

The need for speed could not be greater—not speed of processing but time to market. The problem is driven by the long journey data takes before evolving into insight. Insight, however, is always relative to assumption. In fact, analytics is often seen as a battle between assumption and data. Assumptions can be classified into three types: related to distributions, ratios, and relations. In this session, you will see how the most-valuable business insights can come in the matter of hours, not months, when assumptions are challenged with data. This is made possible by the integration of Oracle Big Data Appliance, enabling transparent access to in-database analytics from the data warehouse and avoiding the traditional long journey of data to insight.

Market Basket Analysis at Dunkin’ Brands [CON6545]

With almost 120 years of franchising experience, Dunkin’ Brands owns two of the world’s most recognized, beloved franchises: Dunkin’ Donuts and Baskin-Robbins. This session describes a market basket analysis solution built from scratch on the Oracle Advanced Analytics platform at Dunkin’ Brands. This solution enables Dunkin’ to look at product affinity and a host of associated sales metrics with a view to improving promotional effectiveness and cross-sell/up-sell to increase customer loyalty. The presentation discusses the business value achieved and technical challenges faced in scaling the solution to Dunkin’ Brands’ transaction volumes, including engineered systems (Oracle Exadata) hardware and parallel processing at the core of the implementation.

Predictive Analytics with Oracle Data Mining [CON8596]

This session presents three case studies related to predictive analytics with the Oracle Data Mining feature of Oracle Advanced Analytics. Service contracts cancellation avoidance with Oracle Data Mining is about predicting the contracts at risk of cancellation at least nine months in advance. Predicting hardware opportunities that have a high likelihood of being won means identifying such opportunities at least four months in advance to provide visibility into suppliers of required materials. Finally, predicting cloud customer churn involves identifying the customers that are not as likely to renew subscriptions as others.

SQL Is the Best Development Language for Big Data [CON7439]

SQL has a long and storied history. From the early 1980s till today, data processing has been dominated by this language. It has changed and evolved greatly over time, gaining features such as analytic windowing functions, model clauses, and row-pattern matching. This session explores what's new in SQL and Oracle Database for exploiting big data. You'll see how to use SQL to efficiently and effectively process data that is not stored directly in Oracle Database.

Advanced Predictive Analytics for Database Developers on Oracle [CON7977]

Traditional database applications use SQL queries to filter, aggregate, and summarize data. This is called descriptive analytics. The next level is predictive analytics, where hidden patterns are discovered to answer questions that give unique insights that cannot be derived with descriptive analytics. Businesses are increasingly using machine learning techniques to perform predictive analytics, which helps them better understand past data, predict future trends, and enable better decision-making. This session discusses how to use machine learning algorithms such as regression, classification, and clustering to solve a few selected business use cases.

What Are They Thinking? With Oracle Application Express and Oracle Data Miner [UGF2861]

Have you ever wanted to add some data science to your Oracle Application Express applications? This session shows you how you can combine predictive analytics from Oracle Data Miner into your Oracle Application Express application to monitor sentiment analysis. Using Oracle Data Miner features, you can build data mining models of your data and apply them to your new data. The presentation uses Twitter feeds from conference events to demonstrate how this data can be fed into your Oracle Application Express application and how you can monitor sentiment with the native SQL and PL/SQL functions of Oracle Data Miner. Oracle Application Express comes with several graphical techniques, and the presentation uses them to create a sentiment dashboard.

Transforming Customer Experience with Big Data and Predictive Analytics [CON8148]

Delivering a high-quality customer experience is essential for long-term profitability and customer retention in the communications industry. Although service providers own a wealth of customer data within their systems, the sheer volume and complexity of the data structures inhibit their ability to extract the full value of the information. To change this situation, service providers are increasingly turning to a new generation of business intelligence tools. This session begins by discussing the key market challenges for business analytics and continues by exploring Oracle’s approach to meeting these challenges, including the use of predictive analytics, big data, and social network analytics.

There are a few other sessions where Oracle Advanced Analytics is included (e.g. Retail GBU, Big Data Strategy), but they are typically more broadly focused.  If you search the Content Catalog for “Advanced Analytics,” you can find other related presentations that involve OAA.

Hope this helps.  Enjoy!


Wednesday Aug 06, 2014

New Book: Predictive Analytics Using Oracle Data Miner

Great New Book Now Available:  Predictive Analytics Using Oracle Data Miner, by Brendan Tierney, Oracle ACE Director

If you have an Oracle Database and want to leverage that data to discover new insights and make predictions, this book is a must-read for you!  In Predictive Analytics Using Oracle Data Miner: Develop & Use Oracle Data Mining Models in Oracle Data Miner, SQL & PL/SQL, Brendan Tierney, Oracle ACE Director and data mining expert, guides the reader through the basic concepts of data mining and offers step-by-step instructions for solving data-driven problems using SQL Developer’s Oracle Data Mining extension.  Brendan takes it full circle by showing the reader how to deploy advanced analytical methodologies and predictive models immediately into enterprise-wide production environments using the in-database SQL and PL/SQL functionality.  

Definitely a must read for any Oracle data professional!

See Predictive Analytics Using Oracle Data Miner, by Brendan Tierney on  

Sunday May 18, 2014

Oracle Data Miner and Oracle R Enterprise Integration - Watch Demo

Oracle Data Miner and Oracle R Enterprise Integration - Watch Demo

Oracle Advanced Analytics (a Database EE option) turns the database into an enterprise-wide analytical platform that can quickly deliver predictive analytics and actionable insights.  Oracle Advanced Analytics comprises the Oracle Data Mining SQL data mining functions, Oracle Data Miner (an extension to SQL Developer that exposes the data mining SQL functions for data analysts), and Oracle R Enterprise (which integrates the R statistical programming language with SQL).  Fifteen powerful in-database SQL data mining functions, the SQL Developer/Oracle Data Miner workflow GUI, and the ability to integrate open source R within an analytical methodology make the Oracle Database + Oracle Advanced Analytics Option an ideal platform for building and deploying enterprise-wide predictive analytics applications and solutions.  

In Oracle Data Miner 4.0 we added a new SQL Query node to allow users to insert arbitrary SQL scripts within an ODMr analytical workflow. Additionally, the SQL Query node allows users to leverage registered R scripts to extend Oracle Data Miner's analytical capabilities.  For applications that are mostly OAA/Oracle Data Mining SQL data mining functions based but require additional analytical techniques found in the R community, this is an ideal method for integrating the power of in-database SQL analytical and data mining functions with the flexibility of open source R.  For applications that are built entirely using the R statistical programming language, it may be more practical to stay within the R console or RStudio environments, but for SQL-centric in-database predictive methodologies, this integration is just what might satisfy your needs.

Watch this Oracle Data Miner and Oracle R Enterprise Integration video on YouTube to see the demo. 

There is an excellent related white paper on this topic, Oracle Data Miner: Integrate Oracle R Enterprise Algorithms into workflow using the SQL Query node (pdf, companion files), which includes examples, on the Oracle Technology Network in the Oracle Data Mining pages.  

Tuesday May 06, 2014

Oracle Data Miner 4.0/SQLDEV 4.0 New Features - Watch Demo!

Oracle Data Miner 4.0 New Features 

Oracle Data Miner/SQLDEV 4.0 (for Oracle Database 11g and 12c)

  • New Graph node (box, scatter, bar, histograms)
  • SQL Query node + integration of R scripts
  • Automatic SQL script generation for deployment

Oracle Advanced Analytics 12c New SQL data mining algorithms/enhancements features exposed in Oracle Data Miner 4.0

  • Expectation Maximization Clustering algorithm
  • PCA & Singular Value Decomposition algorithms
  • Decision Trees can also now mine unstructured data
  • Improved/automated Text Mining, Prediction Details and other algorithm improvements
  • SQL Predictive Queries—automatic build, apply within simple yet powerful SQL query

Tuesday Nov 12, 2013

Oracle Big Data Learning Library

Click on LEARN BY PRODUCT to view all learning resources.

Oracle Big Data Essentials

Attend this Oracle University Course!

Using Oracle NoSQL Database

Attend this Oracle University class!

Oracle and Big Data on OTN

See the latest resource on OTN.


Wednesday Sep 04, 2013

Oracle Data Miner (Extension of SQL Developer 4.0) Integrate Oracle R Enterprise Mining Algorithms into workflow using the SQL Query node

I posted a new white paper authored by Denny Wong, Principal Member of Technical Staff, User Interfaces and Components, Oracle Data Mining Technologies.  You can access the white paper here and the companion files here.  Here is an excerpt:

Oracle Data Miner (Extension of SQL Developer 4.0) 

Integrate Oracle R Enterprise Mining Algorithms into workflow using the SQL Query node

Oracle R Enterprise (ORE), a component of the Oracle Advanced Analytics Option, makes the open source R statistical programming language and environment ready for the enterprise and big data. Designed for problems involving large amounts of data, Oracle R Enterprise integrates R with the Oracle Database. R users can develop, refine and deploy R scripts that leverage the parallelism and scalability of the database to perform predictive analytics and data analysis.

Oracle Data Miner (ODMr) offers a comprehensive set of in-database algorithms for performing a variety of mining tasks, such as classification, regression, anomaly detection, feature extraction, clustering, and market basket analysis. One of the important capabilities of the new SQL Query node in Data Miner 4.0 is a simplified interface for integrating R scripts registered with the database. This provides the support necessary for R Developers to provide useful mining scripts for use by data analysts. This synergy provides many additional benefits as noted below.

· R developers can further extend ODMr mining capabilities by incorporating the extensive R mining algorithms from the open source CRAN packages or leveraging any user developed custom R algorithms via SQL interfaces provided by ORE.

· Since this SQL Query node can be part of a workflow process, R scripts can leverage functionality provided by other workflow nodes, which simplifies the overall effort of integrating R capabilities within the database.

· R mining capabilities can be included in the workflow deployment scripts produced by the new SQL script generation feature, so R functionality can easily be deployed within the context of a Data Miner workflow.

· Data and processing are secured and controlled by the Oracle Database. This alleviates much of the risk incurred with other providers, where users must export data out of the database in order to perform advanced analytics.

Oracle Advanced Analytics saves analysts, developers, database administrators and management the headache of trying to integrate R and database analytics. Instead, users can quickly gain the benefit of new R analytics and spend their time and effort on developing business solutions instead of building homegrown analytical platforms.

This paper should be very useful to R developers wishing to better understand how to leverage embedded R scripts for use by data analysts.  Analysts will also find the paper useful to see how R features can be surfaced for their use in Data Miner. The specific use case covered demonstrates how to use the SQL Query node to integrate R glm and rpart regression model build, test, and score operations into the workflow, along with nodes that perform data preparation and residual plot graphing. However, the integration process described here can easily be adapted to integrate other R operations, like statistical data analysis and advanced graphing, to expand ODMr functionality.
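The build, test, and score steps the paper walks through can be sketched in a language-neutral way. The following plain-Python least-squares fit is only an illustrative stand-in for the regression build/score flow (the function names and data are invented for this example); it is not the ORE, glm, or Data Miner API:

```python
# Illustrative stand-in for the model build and score steps described above.
# Plain ordinary least squares in Python, NOT the ORE glm/rpart integration.

def build_model(xs, ys):
    """Fit y = a + b*x by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

def score(model, xs):
    """Apply the fitted model to new data (the 'score' step)."""
    a, b = model
    return [a + b * x for x in xs]

model = build_model([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
preds = score(model, [5, 6])
```

In the workflow described by the paper, the build and score operations run as registered R scripts inside the database rather than on exported data.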


Monday Jul 15, 2013

Oracle Data Miner GUI, part of SQL Developer 4.0 Early Adopter 1 is now available for download on OTN

The NEW Oracle Data Miner GUI, part of SQL Developer 4.0 Early Adopter 1 is now available for download on OTN.  See link to SQL Developer 4.0 EA1.   

The Oracle Data Miner 4.0 New Features are applicable to Oracle Database 11g Release 2 and Oracle Database Release 12c:  See Oracle Data Miner Extension to SQL Developer 4.0 Release Notes for EA1 for additional information  

· Workflow SQL Script Deployment

o Generates SQL scripts to support full deployment of workflow contents

· SQL Query Node

o Integrate SQL queries to transform data or provide a new data source

o Supports the running of R Language Scripts and viewing of R generated data and graphics

· Graph Node

o Generate Line, Scatter, Bar, Histogram and Box Plots

· Model Build Node Improvements

o Node level data usage specification applied to underlying models

o Node level text specifications to govern text transformations

o Displays heuristic rules responsible for excluding predictor columns

o Ability to control the amount of Classification and Regression test results generated

· View Data

o Ability to drill in to view custom objects and nested tables

These new Oracle Data Miner GUI capabilities expose Oracle Database 12c and Oracle Advanced Analytics/Data Mining Release 1 features:

· Predictive Query Nodes

o Predictive results without the need to build models using Analytical Queries

o Refined predictions based on data partitions

· Clustering Node New Algorithm

o Added Expectation Maximization algorithm

· Feature Extraction Node New Algorithms

o Added Singular Value Decomposition and Principal Component Analysis algorithms

· Text Mining Enhancements

o Text transformations integrated as part of Model's Automatic Data Preparation

o Ability to import Build Text node specifications into a Model Build node

· Prediction Result Explanations

o Scoring details that explain predictive result

· Generalized Linear Model New Algorithm Settings

o New algorithm settings provide feature selection and generation

See OAA on OTN pages for more information on Oracle Advanced Analytics.


Wednesday May 08, 2013

Oracle Advanced Analytics and Data Mining at the Movies on YouTube - Updated August 3, 2015

Updated August 3, 2015

Periodically, I've recorded a demonstration and/or presentation on Oracle Advanced Analytics and Data Mining and have posted them on YouTube.

Here are links to some of more recent YouTube postings--sort of an Oracle Advanced Analytics and Data Mining at the Movies experience.

  1. New Big Data Analytics using Oracle Advanced Analytics 12c and Big Data SQL  - Watch on YouTube
  2. New Oracle Academy Webcast:  Ask the Oracle Experts Fraud &  Anomaly Detection using Oracle Advanced Analytics 12c & Big Data SQL - Watch YouTube
  3. New - Oracle Academy Webcast:  Ask the Oracle Experts Big Data Analytics with Oracle Advanced Analytics - Watch YouTube
  4. Oracle Data Miner and Oracle R Enterprise Integration via SQL Query node - Watch Demo
  5. Oracle Data Miner 4.0 (SQL Developer 4.0 Extension) New Features - Watch Demo
  6. Oracle Business Intelligence Enterprise Edition (OBIEE) SampleAppls Demo featuring integration with Oracle Advanced Analytics/Data Mining
  7. Oracle Big Data Analytics Demo mining remote sensor data from HVACs for better customer service 
  8. In-Database Data Mining for Retail Market Basket Analysis Using Oracle Advanced Analytics
  9. In-Database Data Mining Using Oracle Advanced Analytics for Classification using Insurance Use Case
  10. Fraud and Anomaly Detection using Oracle Advanced Analytics Part 1 Concepts
  11. Fraud and Anomaly Detection using Oracle Advanced Analytics Part 2 Demo
  12. Overview Presentation and Demonstration of Oracle Advanced Analytics Database Option

So.... grab your popcorn and a comfortable chair.  Hope you enjoy!


Oracle Advanced Analytics at the Movies

Friday Feb 22, 2013

Take a FREE Test Drive with Oracle Advanced Analytics/Data Mining on the Amazon Cloud

I wanted to highlight a wonderful new resource provided by our partner Vlamis Software.  Extremely easy!  Fill out the form, wait a few minutes for the Amazon Cloud instance to start up, and then BAM!  You can log in and start using the Oracle Advanced Analytics Oracle Data Miner workflow GUI.  Demo data and online Oracle by Example learning tutorials are also provided to ensure your data mining test drive is a positive one.  Enjoy!!

Test Drive -- Powered by Amazon AWS

We have partnered with Amazon Web Services to provide to you, free of charge, the opportunity to work, hands-on, with the latest of Oracle's Business Intelligence offerings. By signing up for one of the labs below, Amazon's Elastic Compute Cloud (EC2) environment will generate a complete server for you to work with.

These hands on labs are working with the actual Oracle software running on the Amazon Web Services EC2 environment. They each take approximately 2 hours to work through and will give you hands-on experience with the software and a tour of the features. Your EC2 environment will be available for you for 5 hours, at which time it will self-terminate. If, after registration, you need additional time or need further instructions, simply reply to the registration email and we would be glad to help you.

Data Mining

This test drive walks through some basic exercises in doing predictive analytics within an Oracle 11g Database instance using the Oracle Data Miner extension for Oracle SQL Developer. You use a drag-and-drop "workflow" interface to build a data mining model that predicts the likelihood of purchase for a set of prospects. Oracle Data Mining is ideal for automatically finding patterns, understanding relationships, and making predictions in large data sets.


Tuesday Jan 01, 2013

Turkcell Combats Pre-Paid Calling Card Fraud Using In-Database Oracle Advanced Analytics

Turkcell İletişim Hizmetleri A.S. Successfully Combats Communications Fraud with Advanced In-Database Analytics


Turkcell İletişim Hizmetleri A.Ş. is a leading provider of mobile communications in Turkey with more than 34 million subscribers. Established in 1994, Turkcell created the first global system for a mobile communications (GSM) network in Turkey. It was the first Turkish company listed on the New York Stock Exchange.

Communications fraud, or the use of telecommunications products or services without intention to pay, is a major issue for the organization. The practice is fostered by prepaid card usage, which is growing rapidly. Anonymous network-branded prepaid cards are a tempting vehicle for money launderers, particularly since these cards can be used as cash vehicles—for example, to withdraw cash at ATMs. It is estimated that prepaid card fraud represents an average loss of US$5 per US$10,000 in transactions. For a communications company with billions of transactions, this could result in millions of dollars lost through fraud every year.
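The cited loss rate translates directly into an expected annual exposure. A quick back-of-the-envelope calculation, using the US$5 per US$10,000 rate from the text and a purely hypothetical transaction volume:

```python
# Back-of-the-envelope fraud exposure using the rate cited above:
# US$5 lost per US$10,000 of prepaid transactions (0.05%).
loss_rate = 5 / 10_000                 # 0.05% of transaction value
annual_volume = 2_000_000_000          # hypothetical: US$2B in prepaid transactions
expected_loss = loss_rate * annual_volume
print(f"Expected annual fraud loss: US${expected_loss:,.0f}")
```

At even a modest share of Turkcell's billions of transactions, the rate compounds into millions of dollars per year, which is the scale of loss the paragraph above describes.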

Consequently, Turkcell wanted to combat communications fraud and money laundering by introducing advanced analytical solutions to monitor key parameters of prepaid card usage and issue alerts or block fraudulent activity. This type of fraud prevention would require extremely fast analysis of the company’s one petabyte of uncompressed customer data to identify patterns and relationships, build predictive models, and apply those models to even larger data volumes to make accurate fraud predictions.

To achieve this, Turkcell deployed Oracle Exadata Database Machine X2-2 HC Full Rack, so that data analysts can build predictive antifraud models inside the Oracle Database and deploy them into Oracle Exadata for scoring, using Oracle Data Mining, a component of Oracle Advanced Analytics, leveraging Oracle Database 11g technology. This enabled the company to create predictive antifraud models faster than with any other machine, as models can be built using structured query language (SQL) inside the database, and Oracle Exadata can access raw data without summarized tables, thereby achieving extremely fast analyses.


A word from Turkcell İletişim Hizmetleri A.Ş.

“Turkcell manages 100 terabytes of compressed data—or one petabyte of uncompressed raw data—on Oracle Exadata. With Oracle Data Mining, a component of the Oracle Advanced Analytics Option, we can analyze large volumes of customer data and call-data records easier and faster than with any other tool and rapidly detect and combat fraudulent phone use.” – Hasan Tonguç Yılmaz, Manager, Turkcell İletişim Hizmetleri A.Ş.

  • Combat communications fraud and money laundering by introducing advanced analytical solutions to monitor prepaid card usage and alert or block suspicious activity
  • Monitor numerous parameters for up to 10 billion daily call-data records and value-added service logs, including the number of accounts and cards per customer, number of card loads per day, number of account loads over time, and number of account loads on a subscriber identity module card at the same location
  • Enable extremely fast sifting through huge data volumes to identify patterns and relationships, build predictive antifraud models, and apply those models to even larger data volumes to make accurate fraud predictions
  • Detect fraud patterns as soon as possible and enable quick response to minimize the negative financial impact
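The monitoring described above boils down to aggregating call-data records into per-customer parameters and alerting when thresholds are exceeded. A minimal sketch of that aggregation, with a made-up record layout and an illustrative threshold (not Turkcell's actual schema or rules):

```python
# Hypothetical sketch of the per-customer monitoring described above.
# Record fields and the threshold are illustrative inventions.
from collections import defaultdict

records = [
    {"customer": "C1", "card": "K1", "event": "load"},
    {"customer": "C1", "card": "K2", "event": "load"},
    {"customer": "C1", "card": "K3", "event": "load"},
    {"customer": "C2", "card": "K4", "event": "call"},
]

cards_per_customer = defaultdict(set)   # "number of cards per customer"
loads_per_customer = defaultdict(int)   # "number of card loads per day"
for r in records:
    cards_per_customer[r["customer"]].add(r["card"])
    if r["event"] == "load":
        loads_per_customer[r["customer"]] += 1

# Alert on customers whose daily card count exceeds an illustrative threshold.
MAX_CARDS_PER_DAY = 2
alerts = [c for c, cards in cards_per_customer.items()
          if len(cards) > MAX_CARDS_PER_DAY]
```

In production this aggregation runs in-database over up to 10 billion daily records, which is why the in-database SQL approach described below matters.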


Oracle Product and Services

  • Used Oracle Exadata Database Machine X2-2 HC Full Rack to create predictive antifraud models more quickly than with previous solutions by accessing raw data without summarized tables and providing unmatched query speed, which optimizes and shortens the project design phases for creating predictive antifraud models
  • Leveraged SQL for the preparation and transformation of one petabyte of uncompressed raw communications data, using Oracle Data Mining, a feature of Oracle Advanced Analytics to increase the performance of predictive antifraud models
  • Deployed Oracle Data Mining models on Oracle Exadata to identify actionable information in less time than traditional methods—which would require moving large volumes of customer data to third-party analytics software—and achieve an average gain of four hours or more, while also avoiding the system crashes during data import that occurred in the previous environment
  • Achieved extreme data analysis speed with in-database analytics performed inside Oracle Exadata, through a row-wise information search—including day, time, and duration of calls, as well as number of credit recharges on the same day or at the same location—and query language functions that enabled analysts to detect fraud patterns almost immediately
  • Implemented a future-proof solution that could support rapidly growing data volumes that tend to double each year with Oracle Exadata’s massively scalable data warehouse performance

Why Oracle

“We selected Oracle because in-database mining to support antifraud efforts will be a major focus for Turkcell in the future. With Oracle Exadata Database Machine and the analytics capabilities of Oracle Advanced Analytics, we can complete antifraud analysis for large amounts of call-data records in just a few hours. Further, we can scale the solution as needed to support rapid communications data growth,” said Hasan Tonguç Yılmaz, data warehouse/data mining developer, Turkcell Teknoloji Araştırma ve Geliştirme A.Ş.


Oracle Partner: Turkcell Teknoloji Araştırma ve Geliştirme A.Ş.

All development and test processes were performed by Turkcell Teknoloji. The company also made significant contributions to the configuration of numerous technical analyses which are carried out regularly by Turkcell İletişim Hizmetleri's antifraud specialists.



Tuesday May 29, 2012

Fraud and Anomaly Detection using Oracle Data Mining YouTube-like Video

I've created and recorded another YouTube-like presentation and "live" demos of Oracle Advanced Analytics Option, this time focusing on Fraud and Anomaly Detection using Oracle Data Mining.  [Note:  It is a large MP4 file that will open and play in place.  The sound quality is weak so you may need to turn up the volume.]

Data is your most valuable asset. It represents the entire history of your organization and its interactions with your customers.  Predictive analytics leverages data to discover patterns, relationships and to help you even make informed predictions.   Oracle Data Mining (ODM) automatically discovers relationships hidden in data.  Predictive models and insights discovered with ODM address business problems such as:  predicting customer behavior, detecting fraud, analyzing market baskets, profiling and loyalty.  Oracle Data Mining, part of the Oracle Advanced Analytics (OAA) Option to the Oracle Database EE, embeds 12 high performance data mining algorithms in the SQL kernel of the Oracle Database. This eliminates data movement, delivers scalability and maintains security. 

But how do you find these very important needles, the possibly fraudulent transactions, in huge haystacks of data? Oracle Data Mining's 1-Class Support Vector Machine algorithm is specifically designed to identify rare or anomalous records.  The 1-Class SVM anomaly detection algorithm trains on records believed to be "normal," building a descriptive and predictive model that can then be used to flag records that, on a multi-dimensional basis, do not appear to fit in.  Combined with clustering techniques, which sort transactions into more homogeneous sub-populations for more focused anomaly detection analysis, and with Oracle Business Intelligence, enterprise applications, and/or real-time environments to "deploy" fraud detection, Oracle Data Mining delivers a powerful advanced analytical platform for solving important problems.  With OAA/ODM you can find suspicious expense report submissions, flag non-compliant tax submissions, fight fraud in healthcare claims, and save huge amounts of money on fraudulent claims and abuse.
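The core idea, profile "normal" and flag what deviates, can be illustrated with a much simpler detector. The sketch below uses per-feature z-scores on invented data; it is a simplified stand-in for the concept, not the 1-Class SVM algorithm itself, which ODM runs in-database:

```python
# Simplified illustration of "train on normal records, flag what doesn't fit."
# Per-feature z-scores on made-up data, NOT the 1-Class SVM algorithm.
import statistics

# Training set of records considered "normal" (two features per record).
normal = [[10, 1.0], [12, 1.1], [11, 0.9], [13, 1.2], [9, 1.0]]

means = [statistics.mean(col) for col in zip(*normal)]
sds   = [statistics.stdev(col) for col in zip(*normal)]

def is_anomaly(row, threshold=3.0):
    """Flag a record with any feature > threshold std devs from 'normal'."""
    return any(abs(v - m) / s > threshold
               for v, m, s in zip(row, means, sds))

print(is_anomaly([11, 1.0]))   # a typical record
print(is_anomaly([50, 9.0]))   # far outside the training profile
```

A one-class SVM does this far more robustly, learning a multi-dimensional boundary around the normal data rather than checking each feature independently.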

This presentation and several brief demos will show Oracle Data Mining's fraud and anomaly detection capabilities.  


Thursday May 10, 2012

Oracle Virtual SQL Developer Days DB May 15th - Session #3: 1Hr. Predictive Analytics and Data Mining Made Easy!


Oracle Data Mining's SQL Developer-based Oracle Data Miner GUI and ODM are being featured in this upcoming Virtual SQL Developer Day online event next Tuesday, May 15th.  Several thousand people have already registered, and registration is still growing.  We recorded and uploaded presentations/demos, which anyone can view "on demand" at the specified date/time per the SQL Developer Day event agenda.  Anyone can also download a complete 11gR2 Database with SQL Developer 3.1 & Oracle Data Miner GUI extension VM installation for the hands-on labs and follow our 4 ODM Oracle by Example e-training tutorials.  We moderators monitor the online chat and answer questions. 
Session #3: 1Hr. Predictive Analytics and Data Mining Made Easy!
Oracle Data Mining, a component of the Oracle Advanced Analytics database option, embeds powerful data mining algorithms in the SQL kernel of the Oracle Database for problems such as customer churn, predicting customer behavior, up-sell and cross-sell, detecting fraud, market basket analysis (e.g. beer & diapers), customer profiling and customer loyalty. Oracle Data Miner, SQL Developer 3.1 extension, provides data analysts a “workflow” paradigm to build analytical methodologies to explore data and build, evaluate and apply data mining models—all while keeping the data inside the Oracle Database. This workshop will teach the student the basics of getting started using Oracle Data Mining.
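The market basket analysis mentioned above (the classic "beer & diapers" example) rests on two measures: support and confidence. A toy computation, with invented transactions, just to make the measures concrete (ODM computes association rules like this in-database at scale):

```python
# Illustrative support/confidence computation for market basket analysis.
# Transactions are invented; this is the idea, not ODM's algorithm.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "diapers"},
    {"beer", "chips"},
]

def support(itemset):
    """Fraction of baskets containing every item in itemset."""
    return sum(itemset <= b for b in baskets) / len(baskets)

def confidence(antecedent, consequent):
    """P(consequent in basket | antecedent in basket)."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"beer", "diapers"}))        # half the baskets
print(confidence({"diapers"}, {"beer"}))   # of diaper baskets, share with beer
```

A rule like "diapers → beer" is interesting when both its support and its confidence clear chosen thresholds.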
We're also included in the June 7th physical event in NYC and future virtual and physical events.  Great event(s) and great "viz" for OAA/ODM.


Oracle Data Mining Virtual Classes Scheduled

Two Oracle Data Mining Virtual Classes are now scheduled.  Register for a course in 2 easy steps.

Step 1: Select your Live Virtual Class options


Live Virtual Class
Course ID: D76362GC10
Course Title: Oracle Database 11g: Data Mining Techniques
Duration: 2 Days
Price: US$1,300

Step 2: Select the date and location of your Live Virtual Class

Please select a location below then click on the Add to Cart button


Location | Duration | Class Date  | Start Time   | End Time     | Course Materials | Instruction Language | Seats     | Audience
Online   | 2 Days   | 09-Aug-2012 | 04:00 AM EDT | 12:00 PM EDT | English          | English              | Available | Public, Employees
Online   | 2 Days   | 18-Oct-2012 | 04:00 AM EDT | 12:00 PM EDT | English          | English              | Available | Public, Employees

100% Student Satisfaction: Oracle's 100% Student Satisfaction program applies to those publicly scheduled and publicly available Oracle University Instructor Led Training classes that are identified as part of the 100% Student Satisfaction program on the website at the time the class is purchased. Oracle will permit unsatisfied students to retake the class, subject to terms and conditions. Customers are not entitled to a refund. For more information and additional terms, conditions and restrictions that apply, click here

Wednesday Apr 04, 2012

Recorded YouTube-like presentation and "live" demos of Oracle Advanced Analytics/Oracle Data Mining

Ever want to just sit and watch a YouTube-like presentation and "live" demos of Oracle Advanced Analytics/Oracle Data Mining?  Then click here! (plays large MP4 file in a browser)

This 1+ hour long session focuses primarily on the Oracle Data Mining component of the Oracle Advanced Analytics Option and is tied to the Oracle SQL Developer Days virtual and onsite events.   I cover:

  • Big Data + Big Data Analytics
  • Competing on analytics & value proposition
  • What is data mining?
  • Typical use cases
  • Oracle Data Mining high performance in-database SQL based data mining functions
  • Exadata "smart scan" scoring
  • Oracle Data Miner GUI (an Extension that ships with SQL Developer)
  • Oracle Business Intelligence EE + Oracle Data Mining results/predictions in dashboards
  • Applications "powered by" Oracle Data Mining for factory-installed predictive analytics methodologies
  • Oracle R Enterprise

Please contact me should you have any questions.  Hope you enjoy! 

Charlie Berger, Sr. Director of Product Management, Oracle Data Mining & Advanced Analytics, Oracle Corporation

Friday Mar 23, 2012

NEW 2-Day Instructor Led Course on Oracle Data Mining Now Available!

A NEW 2-Day Instructor Led Course on Oracle Data Mining has been developed for customers and anyone wanting to learn more about data mining, predictive analytics and knowledge discovery inside the Oracle Database.  To register interest in attending the class, click here and submit your preferred format.

Course Objectives:

  • Explain basic data mining concepts and describe the benefits of predictive analysis
  • Understand primary data mining tasks, and describe the key steps of a data mining process
  • Use the Oracle Data Miner to build, evaluate, and apply multiple data mining models
  • Use Oracle Data Mining's predictions and insights to address many kinds of business problems, including: Predict individual behavior, Predict values, Find co-occurring events
  • Learn how to deploy data mining results for real-time access by end-users

Five reasons why you should attend this 2 day Oracle Data Mining Oracle University course. With Oracle Data Mining, a component of the Oracle Advanced Analytics Option, you will learn to gain insight and foresight to:

  • Go beyond simple BI and dashboards about the past. This course will teach you about "data mining" and "predictive analytics", analytical techniques that can provide huge competitive advantage
  • Take advantage of your data and investment in Oracle technology
  • Leverage all the data in your data warehouse, including customer data, service data, sales data, customer comments and other unstructured data, and point-of-sale (POS) data, to build and deploy predictive models throughout the enterprise.
  • Learn how to explore and understand your data and find patterns and relationships that were previously hidden
  • Focus on solving strategic challenges to the business, for example, targeting "best customers" with the right offer, identifying product bundles, detecting anomalies and potential fraud, finding natural customer segments and gaining customer insight.

UPDATED for Oracle Database 12c & SQLDEV 4.0: Evaluating Oracle Data Mining Has Never Been Easier - Evaluation "Kit" Available

UPDATED (October 2015) for ORACLE DATABASE 12c & SQL DEVELOPER 4.1 (with ORACLE DATA MINER 4.1)  Extension

The Oracle Advanced Analytics Option turns the database into an enterprise-wide analytical platform that can quickly deliver enterprise-wide predictive analytics and actionable insights. Oracle Advanced Analytics empowers data and business analysts to extract knowledge, discover new insights and make predictions—working directly with large data volumes in the Oracle Database. Oracle Advanced Analytics, an Option of Oracle Database Enterprise Edition, offers a combination of powerful in-database algorithms and integration with open source R algorithms accessible via SQL and R languages and provides a range of GUI (Oracle Data Miner) and IDE (R client, RStudio, etc.) options targeting business users, data analysts, application developers and data scientists.

Now you can quickly and easily get set up to starting using Oracle Data Mining, the SQL API & GUI component of the Oracle Advanced Analytics Database Option for evaluation purposes. Just go to the Oracle Technology Network (OTN) and follow these simple steps.

Oracle Data Mining Evaluation "Kit" Instructions

Step 1: Download and Install the Oracle Database 12c

  • Anyone can download and install the Oracle Database for free for evaluation purposes. Read OTN web site for details.
  • Oracle Database 12c is the latest release and contains many new features.  See the Oracle Advanced Analytics 12c Documentation's New Features and this recent Oracle Data Mining Blog posting.  NOTE:  A major new feature of the 12c Oracle Database is multitenancy and the ability to set up multiple Container databases.  However, to keep things simpler, UNCHECK the "create as Container database" option.  This makes your SQL Developer database connections simpler, and you can then use the simpler-case Oracle Data Miner tutorials.  If you create the Container database(s), your connection details get a bit more complicated.  
  • For Oracle Database Release 11g, an earlier patch set is the minimum; a later patch set is better, and the latest is naturally best if you are a current customer and on active support.
  • Either 32-bit or 64-bit is fine. 4GB of RAM or more works fine for SQL Developer and the Oracle Data Miner GUI extension.
  • Downloading the database and then installing it should take just about an hour or so at most, depending on your network and computer.
  • For more instructions on setting up Oracle Data Mining see:
  • When you install the Oracle Database, the Oracle Data Mining examples, including sample data, are available as part of the total Database installation.  See link.  

Step 2: Install SQL Developer 4.1 (the Oracle Data Miner GUI extension installs automatically, but additional post-installation set up is required.  See Setting Up Oracle Data Miner )

Step 3: Follow the six (6) free step-by-step Oracle-by-Examples Tutorials:

That’s it!  Easy, fun and the fastest way to get started evaluating Oracle Advanced Analytics/Oracle Data Mining.  Enjoy!  



Wednesday Feb 08, 2012

Oracle Announces Availability of Oracle Advanced Analytics for Big Data

Oracle Announces Availability of Oracle Advanced Analytics for Big Data

Oracle Integrates R Statistical Programming Language into Oracle Database 11g

REDWOOD SHORES, Calif. - February 8, 2012

News Facts

  • Oracle today announced the availability of Oracle Advanced Analytics, a new option for Oracle Database 11g that bundles Oracle R Enterprise together with Oracle Data Mining.
  • Oracle R Enterprise delivers enterprise class performance for users of the R statistical programming language, increasing the scale of data that can be analyzed by orders of magnitude using Oracle Database 11g.
  • R has attracted over two million users since its introduction in 1995, and Oracle R Enterprise dramatically advances capability for R users. Their existing R development skills, tools, and scripts can now also run transparently and scale against data stored in Oracle Database 11g.
  • Customer testing of Oracle R Enterprise for Big Data analytics on Oracle Exadata has shown up to 100x increase in performance in comparison to their current environment.
  • Oracle Data Mining, now part of Oracle Advanced Analytics, helps enable customers to easily build and deploy predictive analytic applications that help deliver new insights into business performance. Oracle Advanced Analytics, in conjunction with Oracle Big Data Appliance, Oracle Exadata Database Machine and Oracle Exalytics In-Memory Machine, delivers the industry’s most integrated and comprehensive platform for Big Data analytics.

Comprehensive In-Database Platform for Advanced Analytics

  • Oracle Advanced Analytics brings analytic algorithms to data stored in Oracle Database 11g and Oracle Exadata as opposed to the traditional approach of extracting data to laptops or specialized servers.
  • With Oracle Advanced Analytics, customers have a comprehensive platform for real-time analytic applications that deliver insight into key business subjects such as churn prediction, product recommendations, and fraud alerting.
  • By providing direct and controlled access to data stored in Oracle Database 11g, customers can accelerate data analyst productivity while maintaining data security throughout the enterprise.
  • Powered by decades of Oracle Database innovation, Oracle R Enterprise helps enable analysts to run a variety of sophisticated numerical techniques on billion-row data sets in a matter of seconds, making iterative, speed-of-thought, high-quality numerical analysis on Big Data practical.
  • Oracle R Enterprise drastically reduces the time to deploy models by eliminating the need to translate the models to other languages before they can be deployed in production.
  • Oracle R Enterprise integrates the extensive set of Oracle Database data mining algorithms, analytics, and access to Oracle OLAP cubes into the R language for transparent use by R users.
  • Oracle Data Mining provides an extensive set of in-database data mining algorithms that solve a wide range of business problems. These predictive models can be deployed in Oracle Database 11g and use Oracle Exadata Smart Scan to rapidly score huge volumes of data.
  • The tight integration between R, Oracle Database 11g, and Hadoop enables R users to write one R script that can run in three different environments: a laptop running open source R, Hadoop running with Oracle Big Data Connectors, and Oracle Database 11g.
  • Oracle provides single vendor support for the entire Big Data platform spanning the hardware stack, operating system, open source R, Oracle R Enterprise and Oracle Database 11g. To enable easy enterprise-wide Big Data analysis, results from Oracle Advanced Analytics can be viewed from Oracle Business Intelligence Foundation Suite and Oracle Exalytics In-Memory Machine.

Supporting Quotes

  • “Oracle is committed to meeting the challenges of Big Data analytics. By building upon the analytical depth of Oracle SQL, Oracle Data Mining and the R environment, Oracle is delivering a scalable and secure Big Data platform to help our customers solve the toughest analytics problems,” said Andrew Mendelsohn, senior vice president, Oracle Server Technologies.
  • “We work with leading edge customers who rely on us to deliver better BI from their Oracle Databases. The new Oracle R Enterprise functionality allows us to perform deep analytics on Big Data stored in Oracle Databases. By leveraging R and its library of open source contributed CRAN packages combined with the power and scalability of Oracle Database 11g, we can now do that,” said Mark Rittman, co-founder, Rittman Mead.

Supporting Resources

About Oracle

Oracle engineers hardware and software to work together in the cloud and in your data center. For more information about Oracle (NASDAQ: ORCL), visit


Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Contact Info

Eloy Ontiveros

Joan Levy
Blanc & Otus for Oracle

Monday Sep 19, 2011

Building Predictive Analytical Applications using Oracle Data Mining recorded webcast

I did a Building Predictive Analytical Applications using Oracle Data Mining recorded webcast for IOUG earlier this week.  If this interests you, you can either watch the streaming presentation and demo hosted by the Independent Oracle User Group in conjunction with the Oracle Business Intelligence, Warehousing and Analytics Special Interest Group (BIWA SIG), or download the 84 MB file by clicking on this link to the Building Predictive Analytical Applications using Oracle Data Mining.wmv file.   It includes an overview of data mining, Oracle Data Mining, some demo slides, and then several example applications where we've factory-installed ODM predictive analytics methodologies into the applications for self-learning and real-time deployment of ODM models. 

Example Predictive Analytics Applications (partial list)
  • Oracle Communications & Retail Industry Models — factory-installed data mining for specific industries
  • Oracle Spend Classification
  • Oracle Fusion Human Capital Management (HCM)  Predictive Workforce
  • Oracle Fusion Customer Relationship Management (CRM) Sales Prediction
  • Oracle Adaptive Access Manager real-time Security
  • Oracle Complex Event Processing integrated with ODM models
  • Predictive Incident Monitoring Service for Oracle Database customers

Pretty cool stuff if you or your customers are interested in analytics.  Here's the link to the ppt slides.    

Tuesday Aug 09, 2011

SAIL-WORLD article - America's Cup: Oracle Data Mining supports crew and BMW ORACLE Racing

Originally printed at

America's Cup: Oracle Data Mining supports crew and BMW ORACLE Racing

'USA-17 on her way to winning the 33rd America’s Cup, use of Oracle’s data mining technology and Oracle Database 11g and Oracle Application Express' - BMW Oracle Racing © Photo Gilles Martin-Raget

BMW ORACLE Racing won the 33rd America’s Cup yacht race in February 2010, beating the Swiss team, Alinghi, decisively in the first two races of the best-of-three contest.

BMW ORACLE Racing’s victory in the America’s Cup challenge was a lesson in sailing skill, as one of the world’s most experienced crews reached speeds as fast as 30 knots. But if you listen to the crew in their postrace interviews, you’ll notice that what they talk about is technology.

The wrist PDA displays worn by five of the USA-17 crew, where they could read actual and predictive data fed back from the onboard systems.

'The story of this race is in the technology,' says Ian Burns, design coordinator for BMW ORACLE Racing.

From the drag-resistant materials encasing its hulls to its unprecedented 223-foot wing sail, BMW ORACLE Racing’s trimaran, named USA, is a one-of-a-kind technological juggernaut. No less impressive are the electronics used to guide the vessel and fine-tune its performance.

Each crewmember is equipped with a PDA on his wrist that has customized data for his job: what the load balance is on a particular rope, for example, or the current aerodynamic performance of the wing sail. The helmsman’s sunglasses display graphical and numeric data to help him fine-tune the boat’s direction while he keeps two hands on the wheel and visually scans the sea, the boat, the crew, the sails, and the wing.

The America’s Cup is a challenge-based competition in which the winning yacht club hosts the next event and, within certain guidelines, makes the rules. For the 33rd America’s Cup, the competing teams could not agree on a set of rules, so the event defaulted to an unrestricted format for boat design and cost.

'All we knew were the length of the boat and the course configuration,' says Burns. The boats were allowed a maximum length of 90 feet, and the course would be 20 miles out windward and 20 miles back. 'Within those parameters,' says Burns, 'you could build as fast a thing as you can think of.'

Learning by Data

The no-holds-barred rules for this race created what Burns calls an 'open playground' for boat designers. The innovative and costly vessels that resulted were one-of-a-kind creations with unpredictable sailing characteristics that would require a steep learning curve and lots of data.

33rd America’s Cup - BMW ORACLE Racing - Training in Valencia - collecting data via 250 sensors, managing it and analysing it were handled on the yacht, on the tender and ashore in Valencia and in the Austin Data Centre, USA. - BMW Oracle Racing © Photo Gilles Martin-Raget

'One of the problems we faced at the outset was that we needed really high accuracy in our data because we didn’t have two boats,' says Burns. 'Generally, most teams have two boats, and they sail them side by side. Change one thing on one boat, and it’s fairly easy to see the effect of a change with your own eyes.'

With only one boat, BMW ORACLE Racing’s performance analysis had to be done numerically by comparing data sets. To get the information needed, says Burns, the team had to increase the amount of data collected to nearly 40 times what it had gathered in the past.

The USA holds 250 sensors to collect raw data: pressure sensors on the wing; angle sensors on the adjustable trailing edge of the wing sail to monitor the effectiveness of each adjustment, allowing the crew to ascertain the amount of lift it’s generating; and fiber-optic strain sensors on the mast and wing to allow maximum thrust without overbending them.

33rd America’s Cup - BMW ORACLE Racing - Day 1 - The difference between the wingsail and softsail is evident - even though the softsail has more area - BMW Oracle Racing: Guilain Grenier

But collecting data was only the beginning. BMW ORACLE Racing also had to manage that data, analyze it, and present useful results. The team turned to Oracle Data Mining in Oracle Database 11g.

Peter Stengard, a principal software engineer for Oracle Data Mining and an amateur sailor, became the liaison between the database technology team and BMW ORACLE Racing. 'Ian Burns contacted us and explained that they were interested in better understanding the performance-driving parameters of their new boat,' says Stengard. 'They were measuring an incredible number of parameters across the trimaran, collected 10 times per second, so there were vast amounts of data available for analysis. An hour of sailing generates 90 million data points.'
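Stengard's figures can be sanity-checked with quick arithmetic. The parameter count below is inferred from the stated sampling rate and hourly volume, not quoted from the team:

```python
# Back-of-envelope check of "an hour of sailing generates 90 million data
# points": at 10 samples per second, 90M points per hour implies roughly
# 2,500 parameters captured on every tick.
samples_per_second = 10
seconds_per_hour = 3600
points_per_hour = 90_000_000

implied_parameters = points_per_hour / (samples_per_second * seconds_per_hour)
print(implied_parameters)  # 2500.0
```

Note that a later caption in this article cites nearly 4,000 measurements per tick, which would put the hourly volume closer to 144 million points; the figures likely reflect different stages of the campaign.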

After each day of sailing the boat, Burns and his team would meet to review and share raw data with crewmembers or boat-building vendors using a Web application built with Oracle Application Express. 'Someone in the meeting would say, 'Wouldn’t it be great if we could look at some new combination of numbers?’ and we could quickly build an Oracle Application Express application and share the information during the same meeting,' says Burns. Later, the data would be streamed to Oracle’s Austin Data Center, where Stengard and his team would go to work on deeper analysis.

BMW Oracle USA-17 powers thru Alinghi - America’s Cup 2010 Race 1 - BMW Oracle Racing © Photo Gilles Martin-Raget

Because BMW ORACLE Racing was already collecting its data in an Oracle database, Stengard and his team didn’t have to do any extract, transform, and load (ETL) processes or data conversion. 'We could just start tackling the analytics problem right away,' says Stengard. 'We used Oracle Data Mining, which is in Oracle Database. It gives us many advanced data mining algorithms to work with, so we have freedom in how we approach any specific task.'

Using the algorithms in Oracle Data Mining, Stengard could help Burns and his team learn new things about how their boat was working in its environment. 'We would look, for example, at mast rotations—which rotation works best for certain wind conditions,' says Stengard. 'There were often complex relationships within the data that could be used to model the effect on the target—in this case something called velocity made good, or VMG. Finding these relationships is what the racing team was interested in.'

BMW Oracle Racing Technology team - Richard Gladwell

Stengard and his team could also look at data over time and with an attribute selection algorithm to determine which sensors provided the most-useful information for their analysis. 'We could identify sensors that didn’t seem to be providing the predictive power they were looking for so they could change the sensor location or add sensors to another part of the boat,' Stengard says.
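The attribute-selection idea — ranking sensors by how much predictive signal they contribute toward a target such as VMG — can be sketched with a simple correlation-based ranking. This is an illustrative analog only: ODM's actual Attribute Importance algorithm is more sophisticated, and the sensor names and data below are synthetic:

```python
import math
import random

def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rank_sensors(readings, target):
    """Rank sensor channels by |correlation| with the target measure."""
    scores = {name: abs(correlation(vals, target))
              for name, vals in readings.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Synthetic example: "wing_angle" drives VMG; "hull_temp" is pure noise.
random.seed(17)
wing_angle = [random.uniform(0, 30) for _ in range(200)]
vmg = [0.4 * a + random.gauss(0, 1) for a in wing_angle]
readings = {
    "wing_angle": wing_angle,
    "hull_temp": [random.gauss(20, 3) for _ in range(200)],
}
print(rank_sensors(readings, vmg))  # "wing_angle" ranks first
```

A channel that ranks near the bottom across many sessions is a candidate for relocation, which is essentially the decision the team describes making about underperforming sensors.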

Burns agrees that without the data mining, they couldn’t have made the boat run as fast. 'The design of the boat was important, but once you’ve got it designed, the whole race is down to how the guys can use it,' he says. 'With Oracle database technology, we could compare our performance from the first day of sailing to the very last day of sailing, with incremental improvements the whole way through. With data mining we could check data against the things we saw, and we could find things that weren’t otherwise easily observable and findable.'

BMW Oracle Racing made 4000 data measurements 10 times a second - BMW Oracle Racing: Guilain Grenier

Flying by Data

The greatest challenge of this America’s Cup, according to Burns, was managing the wing sail, which had been built on an unprecedented scale. 'It is truly a massive piece of architecture,' Burns says. 'It’s 20 stories high; it barely fits under the Golden Gate Bridge. It’s an amazing thing to see.'

The wing sail is made of an aeronautical fabric stretched over a carbon fiber frame, giving it the three-dimensional shape of a regular airplane wing. Like an airplane wing, it has a fixed leading edge and an adjustable trailing edge, which allows the crew to change the shape of the sail during the course of a race.

Oracle wing under maintenance - standing 70 metres high it is the longest wing ever built for a plane or yacht - Jean Philippe Jobé


'The crew of the USA was the best group of sailors in the world, but they were used to working with sails,' says Burns. 'Then we put them under a wing. Our chief designer, Mike Drummond, told them an airline pilot doesn’t look out the window when he’s flying the plane; he looks at his instruments, and you guys have to do the same thing.'

A second ship, known as the performance tender, accompanied the USA on the water. The tender served in part as a floating datacenter and was connected to the USA by wireless LAN.

USA-17 about to round the windward mark, Race 1, 33rd America’s Cup. Underperforming sensors on the boat were moved to provide better information. - Richard Gladwell

'The USA generates almost 4,000 variables 10 times a second,' says Burns. 'Sometimes the analysis requires a very complicated combination of 10, 20, or 30 variables fitted through a time-based algorithm to give us predictions on what will happen in the next few seconds, or minutes, or even hours in terms of weather analysis.'
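The time-based prediction Burns describes can be illustrated, in heavily simplified form, by fitting a least-squares trend to a short window of recent samples and extrapolating one step ahead. The window size and the wind-speed series are invented for the sketch; the team's actual models combined tens of variables:

```python
def predict_next(window):
    """Fit y = a + b*t over the recent samples and extrapolate one step."""
    n = len(window)
    ts = range(n)
    mean_t = sum(ts) / n
    mean_y = sum(window) / n
    b = (sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, window))
         / sum((t - mean_t) ** 2 for t in ts))
    a = mean_y - b * mean_t
    return a + b * n  # value expected at the next tick

# Wind speed ramping linearly from 18.0 kt in 0.2 kt steps: the fitted
# trend extrapolates the next sample.
wind = [18.0, 18.2, 18.4, 18.6, 18.8]
print(predict_next(wind))  # ≈ 19.0
```

Feeding such one-step-ahead predictions back to the crew's wrist displays is the loop the tender's onboard analysis served.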

Like the deeper analysis that Stengard does back at the Austin Data Center, this real-time data management and near-real-time data analysis was done in Oracle Database 11g. 'We could download the data to servers on the tender ship, do some quick analysis, and feed it right back to the USA,' says Burns.

'We started to do better when the guys began using the instruments,' Burns says. 'Then we started to make small adjustments against the predictions and started to get improvements, and every day we were making gains.'

Those gains were incremental and data driven, and they accumulated over years—until the USA could sail at three times the wind speed. Ian Burns is still amazed by the spectacle.

'It’s an awesome thing to watch,' he says. 'Even with all we have learned, I don’t think we have met the performance limits of that beautiful wing.'

USA-17 pursues Alinghi 5 - Race 1, 33rd America’s Cup, Valencia. Her crew flew her off the instruments: 'a pilot doesn’t fly a plane by looking out the window'. - BMW Oracle Racing: Guilain Grenier

Read more about Oracle Data Mining

Hear a podcast interview with Ian Burns

Download Oracle Database 11g Release 2

Story republished from:

by Jeff Erickson, 11:41 PM Sat 24 Apr 2010 GMT

Thursday Jul 14, 2011

Oracle Fusion Human Capital Management Application uses Oracle Data Mining for Workforce Predictive Analytics

Oracle's new Fusion Human Capital Management (HCM) application now embeds predictive analytic models automatically generated by Oracle Data Mining to enrich dashboards and managers' portals with predictions about the likelihood that an employee will voluntarily leave the organization and about the employee's likely future performance. Armed with this new information, based on historical patterns and relationships found by Oracle Data Mining, enterprises can more proactively manage their valuable employee assets and compete better. The integrated Oracle Fusion HCM application requires the Oracle Data Mining Option to the Oracle Database. With custom predictive models generated using the customer's own data, Oracle Fusion HCM enables managers to better understand their employees, understand the key factors for each individual, and even perform "What if?" analysis to see the likely impact on an employee of adjusting a critical HR factor, e.g., bonus, vacation time, or amount of travel.
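The "What if?" scoring loop can be sketched with a toy logistic model: score an employee's attrition risk, adjust one factor, and re-score. Everything here — the coefficients, feature names, and employee record — is invented for illustration; the real Fusion HCM models are generated per customer by Oracle Data Mining from that customer's own history:

```python
import math

# Invented coefficients for a toy attrition model: positive weights push the
# predicted probability of leaving up, negative weights push it down.
WEIGHTS = {"years_since_raise": 0.8, "travel_days": 0.05, "bonus_pct": -0.3}
BIAS = -1.0

def attrition_probability(employee):
    """Logistic score: probability the employee voluntarily leaves."""
    z = BIAS + sum(WEIGHTS[k] * employee[k] for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))

employee = {"years_since_raise": 2, "travel_days": 30, "bonus_pct": 5}
base = attrition_probability(employee)

# "What if?": raise the bonus and re-score the same employee.
what_if = dict(employee, bonus_pct=10)
adjusted = attrition_probability(what_if)

print(f"base risk {base:.2f}, after bonus increase {adjusted:.2f}")
```

The manager-facing dashboards present exactly this kind of before/after comparison, just driven by models mined from the customer's historical workforce data rather than hand-set weights.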

Excerpting from the Oracle Fusion HCM website and collateral: "Every day organizations struggle to answer essential questions about their workforce. How much money are we losing by not having the right talent in place and how is that impacting current projects? What skills will we need in the next 5 years that we don’t have today? How will business be impacted by impending retirements and are we prepared? Fragmented systems and bolt-on analytics are only some of the barriers that HR faces today. The consequences include missed opportunities, lost productivity, attrition, and uncontrolled operational costs. To address these challenges, Oracle Fusion Human Capital Management (HCM) puts information at your fingertips, helps you predict future trends, and enables you to turn insight into action. You will eliminate unnecessary costs, increase workforce productivity and retention, and gain a strategic advantage over your competition. Oracle Fusion HCM has been designed from the ground up so that you can work naturally and intuitively with analytics woven right into the fabric of your business processes."


This excerpt from the Solution Brief describes the predictive analytics features and benefits:

"Predictive Analysis: Imagine if you could look ahead and be prepared for upcoming workforce trends. Most organizations do not have the analytic capability to do predictive human capital analysis, yet the worker information needed to make educated forecasts already exists today. Aging populations, shifting demographics, rising and falling economies, and multi-generational issues can have a significant impact on workforce decisions – for employees, managers and HR professionals. Not being able to accurately predict how all the moving parts fit together, and where you really have potential problems, can make or break an organization. Oracle Fusion HCM gives you the ability to finally see into the future, analyzing worker performance potential, risk of attrition, and enabling what-if analysis on ways to improve your workforce. Additionally, modeling capabilities provide you with extra power to bring together information from sources unthinkable in the past. For example, imagine understanding which recruiting agencies are providing the highest-quality recruits by comparing first year performance ratings with sources of hire. Being able to see potential problems before they occur and take immediate action will increase morale, save money, and boost your competitive edge. Result: You are able to look ahead and be prepared for upcoming workforce trends."

There is a great demo of Oracle Fusion HCM Workforce Predictive Analytics that highlights the Oracle Data Mining integration.  This is one of the latest examples of applications "powered by Oracle Data Mining".


Employee grid

When you change your paradigm and move the algorithms to the data rather than the traditional approach of extracting the data and moving it to the algorithms for analysis, it CHANGES EVERYTHING. Keep watching for additional Applications powered by Oracle's in-database advanced analytics.


Everything about Oracle Data Mining, a component of the Oracle Advanced Analytics Option - News, Technical Information, Opinions, Tips & Tricks. All in One Place

