Wednesday Feb 03, 2016

Links to Presentations: BIWA Summit'16 - Big Data + Analytics User Conference Jan 26-28, @ Oracle HQ Conference Center

We had a great www.biwasummit.org event with ~425 attendees, in-depth technical presentations delivered by experts, and even several 2-hour Hands-on Lab training classes that used the Oracle Database Cloud!  Watch for more coverage of the event in various Oracle marketing and partner content venues.

Many thanks to all the BIWA board of directors and the many volunteers who have put in so much work to make this BIWA Summit the best BIWA user event ever.  Mark your calendars for BIWA Summit’17, January 31, Feb. 1 & Feb. 2, 2017.  We’ll be announcing the Call for Abstracts in the future, so please direct your best customers and speakers to submit.  We’re aiming to continue to make BIWA + Spatial + YesSQL Summit the best focused user gathering for sharing best practices for novel and interesting use cases of Oracle technologies.

BIWA is an IOUG SIG run entirely by customers, partners, and Oracle employee volunteers.  We’re always looking for people who would like to be involved.  Let me know if you’d like to contribute to the planning and organization of future BIWA events and activities.

See everyone at BIWA’17!

Charlie, on behalf of the entire BIWA board of directors  (charlie.berger@oracle.com)

(see www.biwasummit.org for more information)

See List of BIWA Summit'16 Presentations below.  Click on Details to access the speaker’s abstract and download the files (assuming the speaker has posted them for sharing).

We now have a schedule at a glance to show you all the sessions in a tabular agenda.


See bottom of page for the Session Search capability

Below is a list of the sessions and links to download most of the materials for the various sessions.  Click on the DETAILS button next to the session you want; the page should refresh with the session description and, assuming the presenter uploaded files (note that files may be limited to 5MB), a list of files for that session.  See the full list below:

Advanced Analytics

Presentations (Click on Details to access file if submitted by presenter)

Dogfooding – How Oracle Uses Oracle Advanced Analytics To Boost Sales Efficiency

Details

Oracle Modern Manufacturing - Bridging IoT, Big Data Analytics and ERP for Better Results

Details

Predictive Modelling and Forecasting using OER

Details

Enabling Clorox as Data Driven Enterprise

Details

Fault Detection using Advanced Analytics at CERN's Large Hadron Collider: Too Hot or Too Cold

Details

Large Scale Machine Learning with Big Data SQL, Hadoop and Spark

Details

Stubhub and Oracle Advanced Analytics

Details

Fiserv Case Study: Using Oracle Advanced Analytics for Fraud Detection in Online Payments

Details

Advanced Analytics for Call Center Operations

Details

Machine Learning on Streaming Data via Integration of Oracle R Enterprise and Oracle Stream Explorer

Details

Learn Predictive Analytics in 2 hours!! Oracle Data Miner 4.0 Hands on Lab

Details

Scaling R to New Heights with Oracle Database

Details

Predictive Analytics using SQL and PL/SQL

Details

Big Data Analytics with Oracle Advanced Analytics 12c and Big Data SQL and the Cloud

Details

Improving Predictive Model Development Time with R and Oracle Big Data Discovery

Details

Oracle R Enterprise 1.5 - Hot new features!

Details

Is Oracle SQL the best language for Statistics

Details

BI and Visualization

Presentations (Click on Details to access file if submitted by presenter)

Electoral fraud location in Brazilian General Elections 2014

Details

The State of BI

Details

Case Study of Improving BI Apps and OBIEE Performance

Details

Preparing for BI 12c Upgrade

Details

Data Visualization at Sound Exchange – a Case Study

Details

Integrating OBIEE and Essbase, Why it Makes Sense

Details

The Dash that changed a culture

Details

Optimize Oracle Business Intelligence Analytics with Oracle 12c In-Memory Database option

Details

Oracle Data Visualization vs. Answers: The Cage Match

Details

What's New With Oracle Business Intelligence 12c

Details

Workforce Analytics Leveraging Oracle Business Intelligence Cloud Services (BICS)

Details

Defining a Roadmap for Migrating to Oracle BI Applications on ODI

Details

See What’s There and What’s Coming with BICS & Data Visualization

Details

Free form Data Visualization, Mashup BI and Advanced Analytics with BI 12c

Details

Oracle Data Visualization Cloud Service Hands-On Lab with Customer Use Cases

Details

On Metadata, Mashups and the Future of Enterprise BI

Details

OBIEE 12c and the Leap Forward in Lifecycle Management

Details

Supercharge BI Delivery with Continuous Integration

Details

Visual Analyzer and Best Practices for Data Discovery

Details

BI Movie Magic: Maps, Graphs, and BI Dashboards at AMC Theatres

Details

Oracle Business Intelligence (OBIEE) the Smart View Way

Details

Big Data

Presentations (Click on Details to access file if submitted by presenter)

Oracle Big Data: Strategy and Roadmap

Details

Oracle Modern Manufacturing - Bridging IoT, Big Data Analytics and ERP for Better Results

Details

Leveraging Oracle Big Data Discovery to Master CERN’s Control Data

Details

Enrich, Transform and Analyse Big Data using Big Data Discovery and Visual Analyzer

Details

Oracle Big Data SQL: Unified SQL Analysis Across the Big Data Platform

Details

High Speed Video Processing for Big Data Applications

Details

Enterprise Data Hub with Oracle Exadata and Oracle Big Data Appliance

Details

How to choose between Hadoop, NoSQL or Oracle Database

Details

Analytical SQL in the Era of Big Data

Details

Cloud Computing

Presentations (Click on Details to access file if submitted by presenter)

Oracle DBaaS Migration Road Map

Details

Centralizing Spatial Data Management with Oracle Cloud Databases

Details

End Users data in BI - Data Mashup and Data Blending with BICS, DVCS and BI 12c

Details

Oracle BI Tools on the Cloud--On Premise vs. Hosted vs. Oracle Cloud

Details

Hybrid Cloud Using Oracle DBaaS: How the Italian Workers Comp Authority Uses Graph Technology

Details

Build Your Cloud with Oracle Engineered Systems

Details

Safe Passage to the CLOUD – Analytics

Details

Your Journey to the Cloud : From Dedicated Physical Infrastructure to Cloud Bursting

Details

Data Warehousing and ETL

Presentations (Click on Details to access file if submitted by presenter)

Getting to grips with SQL Pattern Matching

Details

Making SQL Great Again (SQL is Huuuuuuuuuuuuuuuge!)

Details

Controlling Execution Plans (without Touching the Code)

Details

Taking Full Advantage of the PL/SQL Result Cache

Details

Taking Full Advantage of the PL/SQL Compiler

Details

Advanced SQL: Working with JSON Data

Details

Oracle Database In-Memory Option Boot Camp: Everything You Need to Know

Details

Best Practices for Getting Started With Oracle Database In-Memory

Details

Extreme Data Warehouse Performance with Oracle Exadata

Details

Real-Time SQL Monitoring in Oracle Database 12c

Details

A Walk Through the Kimball ETL Subsystems with Oracle Data Integration

Details

MySQL 5.7 Performance: More Than 1.6M SQL Queries per Second

Details

Implement storage tiering in Data warehouse with Oracle Automatic Data Optimization

Details

Edition-Based Redefinition Case Study

Details

12-Step SQL Tuning Method

Details

Where's Waldo? Using a brute-force approach to find an Execution Plan the CBO hides

Details

Delivering an Enterprise-Wide Standard Chart of Accounts at GE with Oracle DRM

Details

Agile Data Engineering: Introduction to Data Vault Data Modeling

Details

Worst Practice in Data Warehouse Design

Details

Same SQL Plan, Different Performance

Details

Why Use PL/SQL?

Details

Transforming one table to another: SQL or PL/SQL?

Details

Understanding the 10053 Trace

Details

Analytic Views - Bringing Star Queries into the Twenty-First Century

Details

The Place of SQL in the Hybrid World

Details

The Next Generation of the Oracle Optimizer

Details

Internet of Things

Presentations (Click on Details to access file if submitted by presenter)

Oracle Modern Manufacturing - Bridging IoT, Big Data Analytics and ERP for Better Results

Details

Meet Your Digital Twin

Details

Industrial IoT and Machine Learning - Making Wind Energy Cost Competitive

Details

Fault Detection using Advanced Analytics at CERN's Large Hadron Collider: Too Hot or Too Cold

Details

Big Data and the Internet of Things in 2016: Beyond the Hype

Details

IoT for Big Machines

Details

The State of Internet of Things (IoT)

Details

Oracle Spatial Summit

Presentations (Click on Details to access file if submitted by presenter)

Build Your Own Maps with the Big Data Discovery Custom Visualization Component

Details

Massively Parallel Calculation of Catchment Areas in Retail

Details

Dismantling Criminal Networks with Graph and Spatial Visualization and Analysis

Details

Best Practices for Developing Geospatial Apps for the Cloud

Details

Map Visualization in Analytic Apps in the Cloud, On-Premise, and Mobile

Details

Best Practices, Tips and Tricks with Oracle Spatial and Graph

Details

Delivering Smarter Spatial Data Management within Ordnance Survey, UK

Details

Deploying a Linked Data Service at the Italian National Institute of Statistics

Details

ATLAS - Utilizing Oracle Spatial and Graph with Esri for Pipeline GIS and Linear Asset Management

Details

Oracle Spatial 12c as an Applied Science for Solving Today's Real-World Engineering Problems

Details

Assembling a Large Scale Map for the Netherlands Using Oracle 12c Spatial and Graph

Details

Using Open Data Models to Rapidly Develop and Prototype a 3D National SDI in Bahrain

Details

Implementation of LBS services with Oracle Spatial and Graph and MapViewer in Zain Jordan

Details

Interactive map visualization of large datasets in analytic applications

Details

Gain Insight into Your Graph Data -- A hands on lab for Oracle Big Data Spatial and Graph

Details

Applying Spatial Analysis To Big Data

Details

Big Data Spatial: Location Intelligence, Geo-enrichment and Spatial Analytics

Details

What’s New with Spatial and Graph? Technologies to Better Understand Complex Relationships

Details

Graph Databases: A Social Network Analysis Use Case

Details

High Performance Raster Database Manipulation and Data Processing with Oracle Spatial and Graph

Details

3D Data Management - From Point Cloud to City Model

Details

The Power of Geospatial Visualization for Linear Assets Using Oracle Enterprise Asset Management

Details

Oracle Spatial and Graph: New Features for 12.2

Details

Fast, High Volume, Dynamic Vehicle Routing Framework for E-Commerce and Fleet Management

Details

Managing National Broadband Infrastructure at Turk Telekom with Oracle Spatial and Graph

Details

Other

Presentations (Click on Details to access file if submitted by presenter)

Taking Full Advantage of the PL/SQL Compiler

Details

Taking Full Advantage of the PL/SQL Result Cache

Details

Meet Your Digital Twin

Details

Making SQL Great Again (SQL is Huuuuuuuuuuuuuuuge!)

Details

Lightning Round for Vendors

Details


Wednesday Dec 16, 2015

BIWA's Got Talent YouTube Demo Contest! - Enter and Win $500!!!

Best Oracle "Tech Stack" YouTube Demo Contest!

BIWA Wants YOU (Customers, Partners, Oracle Employees, whatever--everyone!) to post one or more YouTube videos that highlight BIWA-focused Oracle technologies/products/features or anything BIWA related!   See #BIWASGOTTALENT

Contest Details

Two categories

  • Customers, Partners, Students, Friends of BIWA--Anyone!
  • Oracle Employees--Note:  Any concerns about eligibility for Oracle employees are the responsibility of the employee

Judges will award points per the following scheme--MAX 100 points

  • Maximum of 40 points: Perception of usefulness and value added to the BIWA community, user or company
  • Maximum 25 points (5 points each): Each Oracle product or major feature highlighted e.g. 5 points for OAA, 5 points for Spatial, 5 points for OBIEE, BDA, BDD, etc.
  • Maximum of 10 points:  Completeness and clarity of associated documentation, reusable code, etc.
  • Maximum of 15 points:  Intangibles e.g. cleverness, sizzle, coolness, etc.--whatever excites and moves the judges
  • Maximum of 10 points: Most "likes" on YouTube

Each YouTube recorded "live" entry must include:

  • BIWAS GOT TALENT with BIWA Summit 2016 Logo (above on this page)
  • Title of your YouTube Video
  • Author(s), titles and contact information
  • Include #BIWASGOTTALENT in the meta information on YouTube
  • When submitting on YouTube, also send an email to biwasgottalent@gmail.com with a link
  • Presentations must not exceed 10 minutes of YouTube video. Submissions longer than 10 minutes will be severely penalized by the judges. :(

The top two presentations in each category will be shown at BIWA Summit 2016 in Redwood Shores, California, January 26-28, 2016

Winners will be chosen based on the total points awarded by the judges.  Submitters are encouraged to promote their #BIWASGOTTALENT video to accumulate "likes".  The prize can be taken as cash or as a donation to charity.

Rules, Regulations and Other Details:

  • By submitting your entry, you agree that BIWA may use your submission for marketing or other purposes
  • The winner will be notified by email by January 28th, 2016 and does not have to be present at BIWA Summit 2016 to win

For questions, please email biwasgottalent@gmail.com

Monday Oct 12, 2015

NHS Business Services Authority Gains Better Insight into Data, Identifies circa GBP100 Million (US$156 Million) in Potential Savings in Just Three Months



The NHS Business Services Authority (NHSBSA) is a special health authority and an arm’s length body of the Department of Health for England. It provides a range of critical central services to NHS organizations, contractors, patients, and the public. Services include managing the NHS Pension schemes in England and Wales, managing payments to primary care dental and pharmacy contractors, and administering the European Health Insurance Card (EHIC).

The NHS budget for 2015/16 is approximately GBP116 billion (US$179 billion) and the total funds administered by the NHSBSA (including those for the NHS Pension schemes) amount to circa GBP32 billion (US$48 billion). The Department of Health asked the NHSBSA to take a proactive role to identify opportunities to reduce costs and eliminate waste. One way to do this was to find better ways to use the vast volumes of data already collected and held within the organization to help reduce fraud and error throughout the health service.

The NHSBSA needed a new, centralized solution that would enable it to gain better value from its data, which is spread across a disparate set of IT systems, data, storage, and analytical capabilities. To achieve this, it chose an end-to-end Oracle solution including Oracle Advanced Analytics, Oracle Exadata Database Machine, Oracle Exalytics In-Memory Machine, Oracle Endeca Information Discovery, and Oracle Business Intelligence Enterprise Edition.

With this Oracle solution, the NHSBSA established its Data Analytics Learning Laboratory (DALL), investing in both technology and expertise to create insight from its data. Within the first three months of operation, the organization identified circa GBP100 million (US$156 million) in potential savings.

Uncovering Savings in Dentistry

A word from NHS Business Services Authority

  • “Oracle Advanced Analytics’ data mining capabilities and Oracle Exalytics’ performance really impressed us. The overall solution is very fast, and our investment very quickly provided value. We can now do so much more with our data, resulting in significant savings for the NHS as a whole.” – Nina Monckton, Head of Information Services, NHS Business Services Authority

The NHSBSA used analytics to identify significant savings within NHS dental services and find instances of activities which do not demonstrate good value for money.

“With Oracle Advanced Analytics, it is much easier to detect anomalies in behaviors. We used anomaly detection to discover where there might be evidence of inappropriate behavior in dentists’ claims, enabling NHS commissioners to follow up and challenge their activities,” explained Nina Monckton, head of information services, NHSBSA.
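For readers curious how this kind of in-database anomaly detection is typically invoked, here is a minimal, hypothetical sketch (it is not the NHSBSA's code; the table, column, and model names are invented for illustration). Oracle Advanced Analytics builds a one-class SVM anomaly model by passing a NULL target to DBMS_DATA_MINING.CREATE_MODEL, and the model can then be scored directly in SQL:

    -- Hypothetical sketch only; object names are placeholders, not NHSBSA code.
    -- Settings table naming the algorithm (one-class SVM for anomaly detection)
    CREATE TABLE anomaly_settings (setting_name VARCHAR2(30), setting_value VARCHAR2(4000));
    INSERT INTO anomaly_settings VALUES ('ALGO_NAME', 'ALGO_SUPPORT_VECTOR_MACHINES');

    BEGIN
      DBMS_DATA_MINING.CREATE_MODEL(
        model_name          => 'CLAIMS_ANOMALY_MODEL',
        mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
        data_table_name     => 'DENTAL_CLAIMS',     -- placeholder claims table
        case_id_column_name => 'CLAIM_ID',
        target_column_name  => NULL,                -- NULL target => one-class (anomaly) model
        settings_table_name => 'ANOMALY_SETTINGS');
    END;
    /

    -- Rank claims by how unusual they look (a prediction of 0 means outlier)
    SELECT claim_id,
           PREDICTION_PROBABILITY(claims_anomaly_model, 0 USING *) AS anomaly_prob
    FROM   dental_claims
    ORDER  BY anomaly_prob DESC;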

Preventing Fraud for European Health Insurance Card

The EHIC is available to all European citizens covered by a statutory social security scheme and entitles them to free healthcare while visiting other European countries. 

During analysis of EHIC data, the NHSBSA discovered commercial addresses being used fraudulently to apply for EHIC cards and uncovered the use of invalid NHS and National Insurance numbers to apply for a card. 

“We used Oracle Exalytics and Oracle Business Intelligence for the EHIC application to improve the front-end validation process, prevent fraud, and blacklist addresses showing suspicious activities,” Monckton said.

Analyzing Billions of Records in Minutes

The NHSBSA receives data relating to more than one billion prescription items dispensed in primary care settings each year. Previously, the NHSBSA did not have the computing power to analyze this data at transaction level.

The NHSBSA can now analyze billions of records at one time, and by analyzing much larger sets of patient data, the NHSBSA can provide insight that is helping to improve standards of care throughout the health service.

“Previously, our information analysts did not have the ability to directly query data as it was mainly held in live operational systems. Now that we are able to transfer data to our Exadata environment, we have dramatically improved our ability to deliver value from our data,” Monckton said.

Analyzing Unstructured Text to Measure Satisfaction

Improving Data Matching To Save Millions of Dollars

In England, some people are entitled to free medical prescriptions or dental treatment from the NHS. The NHSBSA works with the Department of Work and Pensions (DWP) to establish that those patients declaring that they are exempt from a charge for dental treatment and/or medical prescriptions are claiming correctly. Using Oracle Exalytics to compare datasets, the NHSBSA reduced the rate of non-matching records for dentistry from 15% to just 5%.

The Role of Data Governance

Data is now moving to the heart of all NHSBSA programs. As a result of the organization’s new analytics capability, teams have a better understanding of what they can do with the data and are more careful about what data they collect. 

“We now know that if we collect the right data at the start of a program, we can measure what is working down the line. We are starting to change the culture of the organization around our data governance. There has been a massive shift. Data is now central to all our new programs, and data governance is at the heart of everything we do,” Monckton said.

Using the Data Analytics Learning Laboratory to Achieve Strategic Goals

The NHSBSA’s data analytics investment is helping the organization to achieve its five-year strategic goals, which include helping to save GBP1 billion (US$1.56 billion) for NHS patients, reducing unit costs by 50%, improving service and delivering great results for customers, and deriving insight from data to drive change.

“With our newly established Data Lab in place, we can add even more value to the NHS. I cannot begin to describe how significant that has been. This project is really helping us to achieve our strategic goals. In addition, we are working in a different way now and it has even helped with how people interact and function in the workplace.

“We’ve had a very positive response, and our chief executive is extremely impressed with our achievements and the results we have shown so far. As a result, management is recommending that our suppliers and partners come to see what we are doing to learn from our experiences,” Monckton said.

Over the next six months, the DALL team has a large number of analytics projects in the pipeline and is looking to help other areas of the business to better leverage their data. The organization will focus on how it can use Oracle Business Intelligence Enterprise Edition with business users. In addition, the NHSBSA is investigating how it might share data and its analytical ability with other government organizations to drive further value from its investment.

Challenges

  • Use new insight gathered from data to help identify cost savings and meet NHSBSA strategic goals
  • Identify and prevent healthcare fraud and benefit eligibility errors to save costs
  • Leverage existing data to transform business and productivity

Solutions

Oracle Products and Services

  • Identified up to GBP100 million (US$156 million) that could potentially be saved across the NHS through benefit fraud and error reduction, by deploying new analytics infrastructure
  • Identified and implemented changes to prevent fraudulent European Health Insurance Card (EHIC) applications
  • Used data matching to identify savings that can be made through the recovery of money from patients claiming exemption from charges for dental treatment or prescriptions when not eligible to do so
  • Used anomaly detection to uncover fraudulent activity where some dentists split a single course of treatment into multiple parts and presented claims for multiple treatments
  • Analyzed unstructured text to measure employee satisfaction in more detail and found a direct link between those who felt less engaged at work and those more likely to take time off sick
  • Analyzed billions of records at one time to measure longer-term patient journeys and to analyze drug prescribing patterns to improve patient care
  • Established a new Data Analytics Learning Laboratory (DALL) that uses data and analytics to drive action and significant savings for the NHS
  • Implemented Oracle Advanced Analytics, Oracle Exadata Database Machine, Oracle Exalytics In-Memory Machine, Oracle Endeca Information Discovery, and Oracle Business Intelligence Enterprise Edition to deliver fast analysis and data mining for NHS and wider government departments

Why Oracle

“We chose Oracle because the solution could cope with very large data volumes running into billions of rows and could scale as volumes increase. In addition, the Oracle solution required no IT team support to run the queries, which enables our team of data analysts to be self-sufficient. Oracle Exalytics’ in-memory capability gave us the speed we required, and Oracle’s engineered systems accelerated deployment and reduced risk.

“Working with Oracle has been a very positive experience. The team has been incredibly responsive and provided a number of experts to help us get up and running as quickly as possible. With one vendor providing the whole solution, it’s very easy for us. If we need help, we know where to go,” Monckton said.

Implementation Process

Oracle ran a proof of concept (POC) to show the speed and capability of the proposed end-to-end solution. The POC used publicly available data sets for NHS prescription data. It covered 50 million prescribed items, 300 million records, and six months of data. The team concentrated on finding anomalies in the data and carrying out further analysis to understand them before presenting the findings in a clear and straightforward way.

Following the POC, Oracle worked with NHSBSA and its data center partner, Capita, to complete the implementation. During implementation, Oracle provided the NHSBSA with access to a virtual environment. This enabled the team to get some experience with the tools before completing the implementation. As such, NHSBSA was familiar and confident with using the new analytics tools from day one, saving considerable time and gaining immediate value.

NHSBSA identified which data it should use for analysis and transferred it across to its Oracle Exadata environment. To date it has transferred more than 15 billion rows of data into Oracle Exadata. The prescription services database, with 14 billion rows of data, is the largest exported data source at 400 gigabytes. The export took 10 hours to complete with Oracle as the source database.

Advice from NHSBSA

  • Have a clear plan for the first six months before you begin your implementation
  • Ensure you have buy-in from key stakeholders
  • Choose easy areas to start with, so you can demonstrate positive results quickly and prove the value of the solution to others
  • Build knowledge within your team through training and Oracle events; this helps staff to think differently about the possibilities of using data
  • Get help from the experts: talk to your existing suppliers, go to analytics events, and talk to other organizations who have implemented analytics
  • It’s never too early to think about data governance and data quality: recruit a data standards manager to create data governance policies and identify data leads around the business


Friday Oct 02, 2015

BIWA 2016: Here are some of our early accepted presentations!!

BIWA 2016 Early Bird Registration is open! Book NOW! Cost is $299! Book today and save $50!

Here are some of our early accepted presentations!!

“Big Data and the Internet of Things in 2016: Beyond the Hype”—Robert Stackowiak, Oracle

“Near Real Time Data Refresh Strategy for Enterprise Data Warehouses”—Richard Solari & Prithvi Krishnappa

“Predictive Analytics using SQL & PL/SQL”—Brendan Tierney, Oralytics

“On Metadata, Mashups & the Future of Enterprise BI”—Stewart Bryson, Red Pill Analytics

Call for Speakers

We want to hear your Oracle technology success story. We have some amazing presentations on the agenda. Add yours today! The Call for Speakers is on-going through November 2nd, 2015. Click HERE to submit your abstract(s) for BIWA Summit 2016.

Sponsorship

We would like to give your company great exposure through various sponsorship options such as:

· Hosting a conference networking event

· Sponsoring a meal for the attendees

· Keynotes

· Premium exhibit space

Sponsorships start as low as $1,000 and some include conference passes! Click HERE to learn more about how to sign up today!

Friday Sep 25, 2015

Oracle Advanced Analytics at Oracle Open World 2015

While there are a lot of OOW talks that include the words “analytics” or “big data”, this is my short list of sessions, training, and demos that primarily focus on Oracle Advanced Analytics. Hope to see you there!

Charlie 

Oracle Advanced Analytics at OOW'15 Highlights

Big Data Analytics with Oracle Advanced Analytics 12c and Big Data SQL &
Fiserv Case Study: Fraud Detection in Online Payments [CON8743]

Tuesday, Oct 27, 5:15 p.m. | Moscone South—307

· Charles Berger, Sr. Director of Product Management, Advanced Analytics and Data Mining, Oracle

· Miguel M Barrera, Director of Risk Analytics and Strategy

· Julia Minkowski, Risk Analytics Manager

Oracle Advanced Analytics 12c delivers parallelized in-database implementations of data mining algorithms and integration with R. Data analysts use Oracle Data Miner GUI and R to build and evaluate predictive models and leverage R packages and graphs. Application developers deploy Oracle Advanced Analytics models using SQL data mining functions and R. Oracle extends Oracle Database to an analytical platform that mines more data and data types, eliminates data movement, and preserves security to automatically detect patterns, anticipate customer behavior, and deliver actionable insights. Oracle Big Data SQL adds new big data sources and Oracle R Advanced Analytics for Hadoop provides algorithms that run on Hadoop. 

Fiserv manages risk for $30B+ in transfers, servicing 2,500+ US financial institutions, including 27 of the top 30 banks, and prevents $200M in fraud losses every year.  When dealing with potential fraud, reaction needs to be fast.  Fiserv describes their use of Oracle Advanced Analytics for fraud prevention in online payments and shares their best practices and results from turning predictive models into actionable intelligence and next generation strategies for risk mitigation.  
Conference Session

OAA Demo Pod (#3581)—Big Data Predictive Analytics with Oracle Advanced Analytics, R, and Oracle Big Data SQL | Moscone South

The Oracle Advanced Analytics database option embeds powerful data mining algorithms in Oracle Database’s SQL kernel and adds integration with R for solving big data problems such as predicting customer behavior, anticipating churn, detecting fraud, and performing market basket analysis. Data analysts work directly with database data, using the Oracle Data Miner workflow GUI (SQL Developer 4.1 ext.), SQL, or R languages and can extend Oracle Advanced Analytics’ functionality with R graphics and CRAN packages. Oracle Big Data SQL enables Oracle Advanced Analytics models to run on Oracle Big Data Appliance. Oracle R Advanced Analytics for Hadoop provides a powerful R interface over Hadoop and Spark with parallel-distributed predictive algorithms. Learn more in this demo.

Real Business Value from Big Data and Advanced Analytics [UGF4519]

Sunday, Oct 25, 3:30 p.m. | Moscone South—301

· Antony Heljula, Technical Director, Peak Indicators Limited

· Brendan Tierney, Principal Consultant, Oralytics

Attend this session to hear real case studies where big data and advanced analytics have delivered significant return on investment to a variety of Oracle customers. These solutions can pay for themselves within one year. Customer case studies include predicting which employees are likely to leave within the next 12 months, predicting which sales outlets are likely to suffer from out-of-stock products, predicting sales based on the weather forecast, and predicting which students are likely to withdraw early from their courses. A live demonstration illustrates the high-level process for implementing predictive business intelligence (BI) and its best practices.  User Group Forum Session

Customer Panel: Big Data and Data Warehousing [CON8741]

Wednesday, Oct 28, 4:15 p.m. | Moscone South—301

· Craig Fryar, Head of Wargaming Business Intelligence, Wargaming.net

· Manuel Martin Marquez, Senior Research Fellow and Data Scientist, Cern Organisation Européenne Pour La Recherche Nucléaire

· Jake Ruttenburg, Senior Manager, Digital Analytics, Starbucks

· Chris Wones, Chief Enterprise Architect, 8451

· Reiner Zimmermann, Senior Director, DW & Big Data Global Leaders Program, Oracle

In this session, hear how customers around the world are solving cutting-edge analytical business problems using Oracle Data Warehouse and big data technology. Understand the benefits of using these technologies together, and how software and hardware combined can save money and increase productivity. Learn how these customers are using Oracle Big Data Appliance, Oracle Exadata, Oracle Exalytics, Oracle Database In-Memory 12c, or Oracle Analytics to drive their business, make the right decisions, and find hidden information. The conversation is wide-ranging, with customer panelists from a variety of industries discussing business benefits, technical architectures, implementation of best practices, and future directions.  Conference Session

End-to-End Analytics Across Big Data and Data Warehouse for Data Monetization [CON3296]

Monday, Oct 26, 4:00 p.m. | Moscone West—2022

· Satya Bhamidipati, Senior Principal Advanced Analytics Market Dev, Business Analytics Product Group, Oracle

· Gokula Mishra, VP, Big Data & Advanced Analytics, Oracle

Organizations have used data warehouses to manage structured and operational data, which provides business analysts with the ability to analyze key internal data and spot trends. However, the explosion of newer data sources (big data) not only challenges the role of the traditional data warehouse in analyzing data from these diverse sources but also exposes limitations posed by traditional software and hardware platforms. This newer data can be combined with the data in the data warehouse and analyzed without creating another data silo, forming a hybrid data analytics architecture. This presentation discusses the data and analytics platform architecture that enables this data monetization and presents various industry use cases.  Conference Session

Building Predictive Models for Identifying and Preventing Tax Fraud [CON3294]

Wednesday, Oct 28, 9:00 a.m. | Park Central—Concordia

· Brian Bequette, Managing Partner, TPS

· Satya Bhamidipati, Senior Principal Advanced Analytics Market Dev, Business Analytics Product Group, Oracle

According to a TIGTA Audit Report issued in February 2013, in 2012 alone, the IRS identified almost 642,000 incidents of identity theft affecting tax administration, a 38 percent increase since 2010. And this number continues to increase. Tax Processing Systems (TPS) consultants have focused on fraud detection and developed innovative solutions and proprietary algorithms for detecting fraud. In 2012, TPS formed a partnership with Oracle and has adapted its cloud-based methodologies and algorithms for use on the Oracle technology stack. Together, TPS and Oracle have created an end-to-end fraud detection solution that is effective, efficient, and accurate. This presentation focuses on the technology and the algorithms they have developed to detect fraud.  Conference Session

Oracle University Pre-OOW Course – Sunday, Oct. 25th

Using Data Mining Techniques for Predictive Analysis Course, Sunday October 25th

This session teaches students the basic concepts of data mining and how to leverage the predictive analytical power of data mining with Oracle Database by using Oracle Data Miner 12c. Students will learn how to explore the data graphically, build and evaluate multiple data mining models, apply data mining models to new data, and deploy data mining's predictions and insights throughout the enterprise. All this can be performed on the data in Oracle Database on a real-time basis by using the Oracle Data Miner SQL APIs. As the data, models, and results remain in Oracle Database, data movement is eliminated, security is maximized, and information latency is minimized.
See Oracle University at Oracle OpenWorld and Make the Most of Your Oracle OpenWorld and JavaOne Experience with Preconference Training by Oracle Experts

When: Sunday, October 25, 2015, 9 a.m.-4 p.m., with a one-hour lunch break
Where: Golden Gate University, 536 Mission Street, San Francisco, CA 94105 (three blocks from Moscone Center)
Cost: US$850 for a full day of training (cost includes light refreshments and a boxed lunch)

Instructor: Ashwin Agarwal… Read full bio

Target Audience: Data scientists, application developers, and data analysts

Course Objectives:

  • Understand the basic concepts and describe the primary terminology of data mining
  • Understand the steps associated with a data mining process
  • Use Oracle Data Miner 12c to perform data mining
  • Understand the options for deploying data mining predictive results

Course Topics:

  • Understanding the Data Mining Concepts
  • Understanding the Benefits of Predictive Analysis
  • Understanding Data Mining Tasks
  • Key Steps of a Data Mining Process (Includes Demo)
  • Using Oracle Data Miner to Build, Evaluate, and Apply Multiple Data Mining Models (Includes Demo)
  • Using Data Mining Predictions and Insights to Address Various Business Problems (Includes Demo)
  • Predicting Individual Behavior (Includes Demo)
  • Predicting Values (Includes Demo)
  • Finding Co-Occurring Events (Includes Demo)
  • Detecting Anomalies (Includes Demo)
  • Learning How to Deploy Data Mining Results for Real-Time Access by End Users

Prerequisites: A working knowledge of the SQL language and Oracle Database design and administration

Also, on the Big Data + Analytics related product pages on OTN, there is a “Must See” Program Guide. Click on the .pdf link http://www.oracle.com/technetwork/database/openworld2015pdf-2650488.pdf to see the full list.

Friday Aug 07, 2015

Oracle Advanced Analytics Oracle University (OU) Classes in Cambridge, MA. September 28-Oct. 1, 2015

Oracle University has rescheduled its two back-to-back 2-day Oracle Advanced Analytics OU classes in Cambridge, MA.  Please help spread the word. 

Oracle Advanced Analytics combo-course (ODM + ORE) training

This is a great opportunity for big data analytics customers and partners to learn hands-on about using Oracle Advanced Analytics.  Vlamis, an authorized OU instructor, will be teaching the OAA/ODM and OAA/ORE courses again and has been a great and knowledgeable OAA training and implementation partner. The courses also run during the week of Predictive Analytics World in Boston (Oracle will be exhibiting and speaking), so it is a good time for customers to come to Boston, perhaps use some OU credits, learn some new skills, and focus on Oracle’s predictive analytics. 

Anyone (customers and Oracle employees) can register through us at http://www.vlamis.com/training/ or via their normal OU connections. They should be able to utilize OU training credits for either course.  Oracle employees should register through Employee Self Service under Self Service Applications.

Please forward to any appropriate Oracle Advanced Analytics customers and partners.  Thanks!

Charlie 

Sunday Jul 26, 2015

Big Data Analytics with Oracle Advanced Analytics: Making Big Data and Analytics Simple white paper

Big Data Analytics with Oracle Advanced Analytics:

Making Big Data and Analytics Simple

Oracle White Paper  |  July 2014 

Executive Summary:  Big Data Analytics with Oracle Advanced Analytics

(Click HERE to read entire Oracle white paper)   (Click HERE to watch YouTube video)

The era of “big data” and the “cloud” is driving companies to change.  Just to keep pace, they must learn new skills and implement new practices that leverage these new data sources and technologies.  Customers increasingly expect that sharing their digital exhaust with corporations will be repaid with improved interactions and greater perceived value, and these expectations push companies forward.  Big data and analytics offer the promise of satisfying these new requirements.  Cloud, competition, big data analytics, and next-generation “predictive” applications are driving companies towards new goals of delivering improved “actionable insights” and better outcomes.  Traditional BI and analytics approaches don’t deliver these detailed predictive insights and simply can’t satisfy the emerging customer expectations in this new world order created by big data and the cloud.

Unfortunately, with big data, as the data grows and expands along the three V’s (velocity, volume, and variety of data types), new problems emerge.  Data volumes grow and data becomes unmanageable and immovable.  Scalability, security, and information latency become new issues.  Dealing with unstructured data, sensor data, and spatial data all introduces new data type complexities.  

Traditional advanced analytics has several inherent information technology weak points: data extracts and data movement, data duplication resulting in no single source of truth, data security exposures, and, depending on the skills of the data analysts/scientists involved, separate and often multiple analytical tools (commercial and open source) and languages (SAS, R, SQL, Python, SPSS, etc.).  Problems become particularly egregious during the deployment phase, when the worlds of data analysis and information management collide.   

Traditional data analysis typically starts with a representative sample or subset of the data that is exported to separate analytical servers and tools (SAS, R, Python, SPSS, etc.) that have been especially designed for statisticians and data scientists to analyze data.  The analytics they perform range from simple descriptive statistical analysis to advanced, predictive and prescriptive analytics.  If a data scientist builds a predictive model that is determined to be useful and valuable, then IT needs to be involved to figure out deployment, and enterprise deployment and application integration issues become the next big challenge.  The predictive model(s)—and all the associated data preparation and transformation steps—have to be somehow translated to SQL and recreated inside the database in order to apply the models and make predictions on the larger datasets maintained inside the data warehouse.  This model translation phase introduces tedious, time-consuming, and expensive manual coding steps from the original statistical language (SAS, R, or Python) into SQL.  DBAs and IT must somehow “productionize” these separate statistical models inside the database and/or data warehouse for distribution throughout the enterprise.  Some vendors will even charge for specialized products and options just for predictive model deployment.  This is where many advanced analytics projects fail.  Add Hadoop, sensor data, tweets, and expanding big data reservoirs, and the entire “data to actionable insights” process becomes even more challenging.  

Not with Oracle.  Oracle delivers a big data and analytics platform that eliminates the traditional extract, move, load, analyze, export, move, load paradigm.  With Oracle Database 12c and the Oracle Advanced Analytics Option, big data management and big data analytics are designed into the data management platform from the beginning.  Oracle SQL, Big Data SQL, Oracle Exadata, Oracle Big Data Appliance, and integration with open source R (the fruits of Oracle’s multiple decades of R&D investment in the industry’s leading data management platform) are seamlessly combined and integrated into a single platform—the Oracle Database.  

Oracle’s vision is a big data and analytic platform for the era of big data and cloud to:

  • Make big data and analytics simple (for any data size, on any computer infrastructure and any variety of data, in any combination) and

  • Make big data and analytics deployment simple (as a service, as a platform, as an application)

Oracle Advanced Analytics offers a wide library of powerful in-database algorithms and integration with open source R that together can solve a wide variety of business problems and can be accessed via SQL, R, or a GUI.  Oracle Advanced Analytics, an option to the Oracle Database Enterprise Edition 12c, extends the database into an enterprise-wide analytical platform for data-driven problems such as churn prediction, customer segmentation, fraud and anomaly detection, identifying cross-sell and up-sell opportunities, market basket analysis, and text mining and sentiment analysis.  Oracle Advanced Analytics empowers data analysts, data scientists, and business analysts to extract more knowledge, discover new insights, and make informed predictions—working directly with large data volumes in the Oracle Database.   

Data analysts/scientists have choice and flexibility in how they interact with Oracle Advanced Analytics.  Oracle Data Miner is an Oracle SQL Developer extension designed for data analysts that provides an easy to use “drag and drop” workflow GUI to the Oracle Advanced Analytics SQL data mining functions (Oracle Data Mining).  Oracle SQL Developer is a free integrated development environment that simplifies the development and management of Oracle Database in both traditional and Cloud deployments. When Oracle Data Miner users are satisfied with their analytical methodologies, they can share their workflows with other analysts and/or generate SQL scripts to hand to their DBAs to accelerate model deployment.  Oracle Data Miner also provides a PL/SQL API for workflow scheduling and automation.  

R programmers and data scientists can use the familiar open source R console, RStudio, or any IDE to work directly with data inside the database and leverage Oracle Advanced Analytics’ R integration with the database (Oracle R Enterprise).  Oracle R Enterprise transparently translates R into equivalent SQL and Oracle Data Mining functions for in-database performance, parallelism, and scalability—thus making R ready for the enterprise.  

Application developers, using the ODM SQL data mining functions and ORE R integration, can build completely automated predictive analytics solutions that leverage the strengths of the database and the flexibility of R to integrate Oracle Advanced Analytics solutions into BI dashboards and enterprise applications.
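To make the SQL path concrete, here is a minimal sketch of the in-database build-and-score pattern described above (the table, column, and model names are hypothetical; DBMS_DATA_MINING and the PREDICTION operators are the documented Oracle Data Mining SQL interfaces):

    -- Minimal sketch; table, column, and model names below are hypothetical.
    -- Settings table selecting the algorithm for a classification model
    CREATE TABLE churn_settings (setting_name VARCHAR2(30), setting_value VARCHAR2(4000));
    INSERT INTO churn_settings VALUES ('ALGO_NAME', 'ALGO_DECISION_TREE');

    BEGIN
      DBMS_DATA_MINING.CREATE_MODEL(
        model_name          => 'CHURN_MODEL',
        mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
        data_table_name     => 'CUSTOMERS_TRAIN',   -- training data stays in the database
        case_id_column_name => 'CUST_ID',
        target_column_name  => 'CHURNED',
        settings_table_name => 'CHURN_SETTINGS');
    END;
    /

    -- Score current customers in place with the SQL PREDICTION operators
    SELECT cust_id,
           PREDICTION(churn_model USING *)             AS predicted_churn,
           PREDICTION_PROBABILITY(churn_model USING *) AS confidence
    FROM   customers_current;

Because the model and the scoring query both live in the database, there is no model translation or data extraction step to hand off to IT.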

By integrating big data management and big data analytics into the same powerful Oracle Database 12c data management platform, Oracle eliminates data movement, reduces total cost of ownership and delivers the fastest way to deliver enterprise-wide predictive analytics solutions and applications.  

(Click HERE to read entire Oracle white paper)

Friday Jul 24, 2015

2015 BIWA SIG Virtual Conference - Two Days of "Live" Talks by Experts - FREE

2015 BIWA SIG Virtual Conference

July 30-31, 2015 9:00 a.m. - 1:00 p.m. CDT

Join us for two full days where you will hear about the latest Business Intelligence trends. 

Day One:

  • 9:00 a.m. - 10:00 a.m.: What’s new in Oracle EPM and BI Infrastructure - Eric Helmer, ADI Strategies

Hyperion EPM and BI Fusion edition is a dramatic change under the covers. Corporations must consider more global approaches to infrastructure to maintain availability and performance while reducing footprint and cost. Technologies such as Exalytics, Oracle virtualization, cloud computing, software as a service, and open source operating systems (Linux) are more commonplace. Join Oracle ACE Director Eric Helmer as he covers what’s new, what’s supported, and what options you have when implementing your EPM/BI project.

  • 10:00 a.m. - 11:00 a.m.: Italian Ministry of Labor & Social Policy -- A Journey to Digital Government - Nicola Sandoli, ICONSULTING

The Italian Ministry of Labor and Social Policy (MLPS) is a branch of the Italian government responsible for all labor matters, including employment policies, promotions, worker protection, and social security. In its evolution towards a digital government, MLPS is streamlining and simplifying its administrative processes. MLPS has embarked on a data-driven journey to redefine business models and interactions with citizens – and optimize and transform government services. MLPS is focusing on four areas:

  • Information delivery: transitioning its data warehouse platform from reporting to centralizing and certifying data
  • Business Intelligence: monitoring activities, web publishing, and analyzing socio-political impact
  • Web analytics and semantic intelligence: interacting more efficiently with citizens
  • Job-hunting online guidance services: real-time answers to young people looking for jobs

MLPS is using a wide range of Oracle technologies to manage large amounts of diverse data and apply advanced analytics, including:

  • Oracle Exalytics for daily updates of 5TB of data
  • Oracle Spatial and Graph and MapViewer 11g for location intelligence capabilities
  • Oracle Business Intelligence for desktop and mobile reporting
  • Oracle Endeca Information Discovery for web analytics, data discovery, and data analysis using social and semantic intelligence
  • Oracle Real-Time Decisions
  • Oracle Service-Oriented Architecture Suite: central point for accessing and managing information made available through the Ministry web portal Cliclavoro

Learn more about MLPS and its innovative platform that is delivering better information and services to their constituents.

  • 11:00 a.m. - 12:00 p.m.: Exadata: Elastic Configurations and IaaS – Private Cloud - Amit Kanda, Oracle

Customers are faced with business challenges that include making real-time, data-driven decisions and reducing costs. Exadata’s extreme performance combined with Database In-Memory addresses the need for real-time, data-driven decisions. Elastic configurations and an updated subscription model (IaaS – Private Cloud) for Exadata hardware and software accompanied the launch of Exadata X5-2. This presentation will describe these updates and how customers can start small with Exadata and grow Exadata with their business, making it easier to reach business objectives.

  • 12:00 p.m. - 1:00 p.m.: The State of Internet of Things (IoT) - Shyam Varan Nath, GE

The Internet of Things, or IoT, is poised to have a tremendous impact around us. This session will look at the industry landscape of IoT. The different flavors of IoT will be discussed with use cases from the consumer, commercial, and industrial sectors. Learn about the edge and cloud computing platforms that power IoT solutions. Finally, we will walk through use cases that show how machine/sensor data is being monetized through analytics. Such use cases will span aviation and other industries.


Day Two:

  • 9:00 a.m. - 10:00 a.m.: Big Data Analytics with Oracle Advanced Analytics 12c and Big Data SQL - Charlie Berger, Oracle

Oracle Advanced Analytics 12c delivers parallelized in-database implementations of data mining algorithms and integration with R. Data analysts use the Oracle Data Miner GUI and R to build and evaluate predictive models and leverage R packages and graphs. Application developers deploy OAA models using SQL data mining functions and R. Oracle extends the Database to an analytical platform that mines more data and more data types, eliminates data movement, and preserves security to automatically detect patterns, anticipate customer behavior, and deliver actionable insights. Oracle Big Data SQL adds new big data sources and ORAAH provides algorithms that run on Hadoop. Come learn what’s new, best practices, and hear customer examples.

  • 10:00 a.m. - 11:00 a.m.: Graph Data Management and Analytics for Big Data - Bill Beauregard, Oracle & Zhe Wu, Oracle

The newest Oracle big data product, Oracle Big Data Spatial and Graph, offers a set of spatial analytic services, and a graph database with rich graph analytics that support big data workloads on Apache Hadoop and NoSQL technologies. Oracle is applying over a decade of expertise with spatial and graph analytic technologies to big data architectures. Graphs are an important data model for big data systems. Property graphs can be used for discovery, for instance, to discover underlying communities and influencers within a social graph, relationships and connections in cyber security networks, and to generate recommendations based on interests, profiles, and past behaviors. Oracle Big Data Spatial and Graph provides optimized storage, search and querying in Oracle NoSQL Database and Apache HBase for distributed property graphs. It offers 35 built-in, in-memory, parallel property graph analytic functions. We will discuss use cases, features, architecture, and show a demo. Learn how developers and data scientists can manage their most challenging graph data processing in a single enterprise-class Big Data platform.

  • 11:00 a.m. - 12:00 p.m.: Why Oracle Database In-Memory?  Use Cases and Overview - Andy Rivenes, Oracle

Oracle recently announced the availability of the Oracle Database In-Memory option, a memory-optimized database technology that transparently adds real-time analytics to applications. Because the In-Memory option is 100% compatible with existing Oracle Database applications, it’s easy to integrate it into your environment and to begin reaping the benefits. But how do you get started with it? What do you need to know to take full advantage of this new functionality? This session will give an overview of what Oracle Database In-Memory is and then discuss some use cases to highlight how it can be used.
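As a rough illustration of what getting started can look like (a sketch only; the table name and memory size below are placeholders, not recommendations from the session):

    -- Size the In-Memory column store (requires a restart), then mark a table
    -- for population and check its population status.
    ALTER SYSTEM SET inmemory_size = 4G SCOPE=SPFILE;

    ALTER TABLE sales INMEMORY MEMCOMPRESS FOR QUERY LOW PRIORITY HIGH;

    -- Population status (populated after a scan or per the PRIORITY setting)
    SELECT segment_name, populate_status, bytes_not_populated
    FROM   v$im_segments;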

Register Here


Wednesday Jul 15, 2015

Call for Abstracts at BIWA Summit'16 - The Oracle Big Data + Analytics User Conference


Please email shyamvaran@gmail.com with any questions regarding the submission process.

What Successes Can You Share?

We want to hear your story. Submit your proposal today for the Oracle BIWA Summit 2016.

Proposals will be accepted through Monday evening, November 2, 2015, at midnight, EST. Don’t wait, though—we’re accepting submissions on a rolling basis, so that selected sessions can be published early on our online agenda.

To submit your abstract, click here, select a track, and fill out the form.

Please note:

  • Presentations must be noncommercial.
  • Sales promotions for products or services disguised as proposals will be eliminated. 
  • Speakers whose abstracts are accepted will be expected to submit (at a later date) a PowerPoint presentation slide set. 
  • Accompanying technical and use case papers are encouraged, but not required.

Speakers whose abstracts are accepted will be given a complimentary registration to the conference. (Any additional co-presenters must register for the event separately and provide appropriate registration fees. It is up to the co-presenters’ discretion which presenter to designate for the complimentary registration.) 

This Year’s Tracks

Proposals can be submitted for the following tracks: 

More About the Conference

The Oracle BIWA Summit 2016 is organized and managed by the Oracle BIWA SIG, the Oracle Spatial SIG, and the Oracle Northern California User Group. The event attracts top BI, data warehousing, analytics, Spatial, IoT and Big Data experts.

The three-day event includes keynotes from industry experts, educational sessions, hands-on labs, and networking events.

Hot topics include: 

  • Database, data warehouse and cloud, Big Data architecture
  • Deep dives and hands-on labs on existing Oracle BI, data warehouse, and analytics products
  • Updates on the latest Oracle products and technologies (e.g. Big Data Discovery, Oracle Visual Analyzer, Oracle Big Data SQL)
  • Novel and interesting use cases on everything – Spatial, Graph, Text, Data Mining, IoT, ETL, Security, Cloud
  • Working with Big Data (e.g., Hadoop, "Internet of Things,” SQL, R, Sentiment Analysis)
  • Oracle Business Intelligence (OBIEE), Oracle Big Data Discovery, Oracle Spatial, and Oracle Advanced Analytics—Better Together

Hope to see you at BIWA'16 in January, 2016!

Charlie

Monday May 04, 2015

Oracle Data Miner 4.1, SQL Developer 4.1 Extension Now Available!

To download, visit:  

http://www.oracle.com/technetwork/developer-tools/sql-developer/overview/index-097090.html

New Data Miner Features in SQL Developer 4.1

These new Data Miner 4.1 features are supported for database versions supported by Oracle Data Miner: 

JSON Data Support for Oracle Database 12.1.0.2 and above

In response to the growing popularity of JSON data and its use in Big Data configurations, Data Miner now provides an easy-to-use JSON Query node. The JSON Query node allows you to select and aggregate JSON data without entering any SQL commands, and it opens up all of the existing Data Miner features for use with JSON data. (A brief SQL sketch of the underlying JSON support appears after the JSON Query Node list below.) The enhancements include:

Data Source Node

  • Automatically identifies columns containing JSON data by identifying those with the IS_JSON constraint.
  • Generates a JSON schema for any selected column that contains JSON data.
  • Imports a JSON schema for a given column.
  • JSON schema viewer.

Create Table Node

  • Ability to select a column to be typed as JSON.
  • Generates a JSON schema in the same manner as the Data Source node.

JSON Data Type

  • Columns can be specifically typed as JSON data.

JSON Query Node (see related JSON node blog posting)

  • Ability to utilize any of the selection and aggregation features without having to enter SQL commands.
  • Ability to select data from a graphical layout of the JSON schema, making data selection as easy as it is with scalar relational data columns.
  • Ability to partially select JSON data as standard relational scalar data while leaving other parts of the same JSON document as JSON data.
  • Ability to aggregate JSON data in combination with relational data. Includes the Sub-Group By option, used to generate nested data that can be passed into mining model build nodes.
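For context, the SQL/JSON features in Oracle Database 12.1.0.2 that this support builds on look roughly like the following sketch (table, column, and path names are hypothetical): the IS JSON check constraint is what the Data Source node detects, and JSON_VALUE and JSON_TABLE illustrate the kind of scalar projection and array flattening the JSON nodes let you do without writing the SQL yourself.

    -- Hypothetical example: a column constrained to hold JSON, plus SQL/JSON queries.
    CREATE TABLE orders_json (
      id   NUMBER PRIMARY KEY,
      doc  CLOB CONSTRAINT orders_doc_is_json CHECK (doc IS JSON));

    -- Project scalar attributes out of the JSON document
    SELECT o.id,
           JSON_VALUE(o.doc, '$.customer.name') AS customer_name
    FROM   orders_json o;

    -- Flatten a JSON array into relational rows (e.g., for aggregation)
    SELECT o.id, jt.item_name, jt.qty
    FROM   orders_json o,
           JSON_TABLE(o.doc, '$.items[*]'
             COLUMNS (item_name VARCHAR2(100) PATH '$.name',
                      qty       NUMBER        PATH '$.quantity')) jt;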

General Improvements
o    Improved database session management resulting in less database sessions being generated and a more responsive user interface.
o    Filter Columns Node - Combined primary Editor and associated advanced panel to improve usability.
o    Explore Data Node - Allows multiple row selection to provide group chart display.
o    Classification Build Node - Automatically filters out rows where the Target column contains NULLs or all spaces. Also issues a warning to the user but continues with the model build.
o    Workflow - Enhanced workflows to ensure that Loading, Reloading, Stopping, Saving operations no longer block the UI.
o    Online Help - Revised the Online Help to adhere to topic-based framework.

Selected Bug Fixes (does not include 4.0 patch release fixes)
o    GLM Model Algorithm Settings: Added GLM feature identification sampling option (Oracle Database 12.1 and above).
o    Filter Rows Node: Custom Expression Editor not showing all possible available columns.
o    WebEx Display Issues: Fixed problems affecting the display of the Data Miner UI through WebEx conferencing.


For More Information and Support, please visit the Oracle Data Mining Discussion Forum on the Oracle Technology Network (OTN)

Return to Oracle Data Miner page on OTN

Wednesday Apr 22, 2015

OpenWorld 2015 Call for Proposals Extended to Wed, May 6th, 11:59 p.m

OpenWorld 2015 Call for Proposals Extended to Wed, May 6th, 11:59 p.m https://www.oracle.com/openworld/call-for-proposals.html Submit your Oracle Advanced Analytics stories now

If you’re an Oracle technology expert, conference attendees want to hear it straight from you. So don’t wait: proposals must be submitted by the extended deadline of May 6.

Wanted: Outstanding Oracle Experts

The Oracle OpenWorld 2015 Call for Proposals is now open. Attendees at the conference are eager to hear from experts on Oracle business and technology. They’re looking for insights and improvements they can put to use in their own jobs: exciting innovations, strategies to modernize their business, different or easier ways to implement, unique use cases, lessons learned, the best of best practices.

If you’ve got something special to share with other Oracle users and technologists, they want to hear from you, and so do we. Submit your proposal now for this opportunity to present at Oracle OpenWorld, the most important Oracle technology and business conference of the year.

We recommend you take the time to review the General Information, Submission Information, Content Program Policies, and Tips and Guidelines pages before you begin. We look forward to your submissions.


Submit Your Proposal

By submitting a session for consideration, you authorize Oracle to promote, publish, display, and disseminate the content submitted to Oracle, including your name and likeness, for use associated with the Oracle OpenWorld and JavaOne San Francisco 2015 conferences. Press, analysts, bloggers and social media users may be in attendance at OpenWorld or JavaOne sessions.


General Information

  • Conference location: San Francisco, California, USA
  • Dates: Sunday, October 25 to Thursday, October 29, 2015
  • Website: Oracle OpenWorld

Key Dates for 2015

Deliverables Due Dates
Call for Proposals—Open Wednesday, March 25
Call for Proposals—Closed Wednesday, May 6, 11:59 p.m. PDT (extended from April 29)
Notifications for accepted and declined submissions sent Mid-June

Contact us

  • For questions regarding the Call for Proposals, send an e-mail to speaker-services_ww@oracle.com.
  • For technical questions about the submission tool or issues with submitting your proposal, send an e-mail to OpenWorldContent@gpj.com.
  • Oracle employee submitters should contact the appropriate Oracle track leads before submitting. To view a list of track leads, click here.

Saturday Mar 28, 2015

Use Repository APIs to Manage and Schedule Workflows to run

Data Miner 4.1 ships with a set of repository PL/SQL APIs that allow applications to manage Data Miner projects and workflows directly. The workflow APIs enable applications to execute workflows immediately or schedule workflows to execute using specific time intervals or using defined schedules. The workflow run APIs internally use Oracle Scheduler for scheduling functionality. Moreover, repository views are provided for applications to query project and workflow information. Applications can also monitor workflow execution status and query generated results using these views.

With the workflow APIs, applications can seamlessly integrate workflow execution into their own processes.  Moreover, all generated results remain accessible to Data Miner, so you can view them using the Data Miner user interface.

For more information, please read the White Paper Use Repository APIs to Manage and Schedule Workflows to run

Monday Dec 15, 2014

Use Oracle Data Miner to Perform Sentiment Analysis inside Database using Twitter Data Demo

Sentiment analysis has been a hot topic recently; sentiment analysis, or opinion mining, refers to the application of natural language processing, computational linguistics, and text analytics to identify and extract subjective information in source materials.  Social media websites are a good source of people’s sentiments.  Companies have been using social networking sites to make new product announcements, promote their products, collect product reviews and user feedback, interact with their customers, etc.  It is important for companies to sense customer sentiments toward their products, so they can react accordingly and benefit from customers’ opinions.

In this blog, we will show you how to use Data Miner to perform some basic sentiment analysis (based on text analytics) using Twitter data.  The demo data was downloaded from the developer API console page of the Twitter website.  The data itself originated from the Oracle Twitter page, and it contains about a thousand tweets posted in the past six months (May to Oct 2014).  We will determine the sentiments (highly favored, moderately favored, and less favored) of tweets based on their favorite counts, and assign the sentiment to each tweet.  We then build classification models using these tweets along with their assigned sentiments.  The goal is to predict how well a new tweet will be received by customers.  This may help the marketing department better craft a tweet before it is posted.

The demo (click here to download the demo Twitter data and workflow) uses the newly added JSON Query node in Data Miner 4.1 to import the Twitter data; please review the “How to import JSON data to Data Miner for Mining” blog entry in the previous post.

Workflow for Sentiment Analysis

The following workflow shows the process we use to prepare the twitter data, determine the sentiments of tweets, and build classification models on the data.

The following describes the nodes used in the above workflow:

  • Data Source (TWITTER_LARGE)
    • Select the demo Twitter data source.  The sample Twitter data is attached with this blog.
  • JSON Query (JSON Query)
    • Select the required JSON attributes used for analysis; we only use the “id”, “text”, and “favorite_count” attributes.  The “text” attribute contains the tweet, and the “favorite_count” attribute indicates how many times the tweet has been favorited.
  • SQL Query (Cleanse Tweets)
    • Remove shortened URLs and punctuation within tweets because they contain no predictive information.
  • Filter Rows (Filter Rows)
    • Remove retweeted tweets because these are duplicate tweets.
  • Transform (Transform)
    • Bin the “favorite_count” data into three quantiles; each quantile represents a sentiment.  The top quantile represents the “highly favored” sentiment, the middle quantile the “moderately favored” sentiment, and the bottom quantile the “less favored” sentiment.
  • SQL Query (Recode Sentiment)
    • Assign quantiles as determined sentiments to tweets.
  • Create Table (OUTPUT_4_29)
    • Persist the data to a table for classification model build (optional).
  • Classification (Class Build)
    • Build classification models to predict customer sentiment toward a new tweet (how much will customers like this new tweet?).

Data Source Node (TWITTER_LARGE)

Select the JSON_DATA in the TWITTER_LARGE table.  The JSON_DATA contains about a thousand tweets to be used for sentiment analysis.

JSON Query Node (JSON Query)

Use the new JSON Query node to select the following JSON attributes.  This node projects the JSON data to relational data format, so that it can be consumed within the workflow process.

SQL Query Node (Cleanse Tweets)

Use the REGEXP_REPLACE function to remove numbers, punctuation, and shortened URLs inside tweets because these are considered noise and do not provide any predictive information.  Notice we do not treat hashtags inside tweets specially; these tags are treated as regular words.

We specify the number, punctuation, and URL patterns in regular expression syntax and use the database function REGEXP_REPLACE to replace these patterns inside all tweets with an empty string.

SELECT
REGEXP_REPLACE("JSON Query_N$10055"."TWEET", '([[:digit:]*]|[[:punct:]*]|(http[s]?://(.*?)(\s|$)))', '', 1, 0) "TWEETS",
"JSON Query_N$10055"."FAVORITE_COUNT",
"JSON Query_N$10055"."ID"
FROM
"JSON Query_N$10055"

Filter Rows Node (Filter Rows)

Remove retweeted tweets because these are duplicate tweets.  Usually, retweeted tweets start with the “RT” abbreviation, so we specify a row filter condition to filter out those tweets, as sketched below.
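One way to express that condition in the Filter Rows node (a minimal sketch; TWEETS is the cleansed tweet column produced by the previous node) is:

TWEETS NOT LIKE 'RT %'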

Transform Node (Transform)

Use the Transform node to bin the “favorite_count” data into three quantiles; each quantile represents a sentiment.  For simplicity, we just bin the count into three quantiles without applying any special treatment first.
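For reference, an equivalent three-way quantile bin can be expressed in plain SQL with the NTILE analytic function. This is only a sketch of the idea, not the Transform node's generated code; CLEANSED_TWEETS is a hypothetical name for the output of the previous node.

SELECT id,
       tweets,
       favorite_count,
       NTILE(3) OVER (ORDER BY favorite_count) AS sentiment_bin  -- 1 = less favored, 3 = highly favored
FROM   cleansed_tweets;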

SQL Query Node (Recode Sentiment)

Assign quantiles as determined sentiments to tweets; the top quantile represents the “highly favored” sentiment, the middle quantile the “moderately favored” sentiment, and the bottom quantile the “less favored” sentiment.  These sentiments become the target classes for the classification model build.

Classification Node (Class Build)

Build Classification models using the sentiment as target and tweet id as case id.

Since the TWEETS column contains the textual tweets, we change its mining type to Text Custom.

Enable the Stemming option for text processing.

Compare Test Results

After the model build completes successfully, open the test viewer to compare the model test results. The SVM model seems to produce the best prediction for the “highly favored” sentiment (57% correct predictions).

Moreover, the SVM model has a better lift result than the other models, so we will use this model for scoring.

Sentiment Prediction (Scoring)

Let’s score this tweet “this is a boring tweet!” using the SVM model.
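In SQL, single-record scoring against the in-database model might look like the following sketch. The model name CLAS_SVM_1_1 is only an assumption (your generated model name will differ), and the sketch assumes TWEETS was built as a text attribute.

SELECT PREDICTION(CLAS_SVM_1_1 USING 'this is a boring tweet!' AS TWEETS)             AS sentiment,
       PREDICTION_PROBABILITY(CLAS_SVM_1_1 USING 'this is a boring tweet!' AS TWEETS) AS probability
FROM   DUAL;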

As expected, this tweet receives a “less favored” prediction.

How about this tweet “larry is doing a data mining demo now!” ?

Not surprisingly, this tweet receives a “highly favored” prediction.

Last but not least, let’s see the sentiment prediction for the title of this blog

Not bad! It gets a “highly favored” prediction, so it seems this title will be well received by the audience.

Conclusion

The best SVM model only produces 57% accuracy for the “highly favored” sentiment prediction, but it is reasonably better than a random guess.  With a larger sample of tweet data, the model accuracy could be improved.  The new JSON Query node enables us to perform data mining on JSON data, which is the most popular data format produced by prominent social networking sites.

Monday Dec 08, 2014

How to import JSON data to Data Miner for Mining

JSON is a popular lightweight data format used in Big Data. Increasingly, much of the data produced in Big Data environments is in JSON format. For example, web logs generated by middle-tier web servers are likely in JSON format. NoSQL database vendors have chosen JSON as their primary data representation. Moreover, the JSON format is widely used in the RESTful-style Web service responses generated by the most popular social media websites like Facebook, Twitter, and LinkedIn. This JSON data could potentially contain a wealth of information that is valuable for business use, so it is important that we can bring this data over to Data Miner for analysis and mining purposes.

Oracle Database 12.1.0.2 provides the ability to store and query JSON data. To take advantage of the database JSON support, the upcoming Data Miner 4.1 adds a new JSON Query node that allows users to query JSON data in relational format. In addition, the existing Data Source node and Create Table node are enhanced to allow users to specify JSON data in the input data source.

In this blog, I will show you how to specify JSON data in the input data source and use the JSON Query node to selectively query desirable attributes and project the result into relational format. Once the data is in relational format, users can treat it as a normal relational data source and start analyzing and mining it immediately. The Data Miner repository installation installs a sample JSON dataset, ODMR_SALES_JSON_DATA, which I will be using here. Note that Oracle Big Data SQL supports queries against vast amounts of big data stored in multiple data sources, including Hadoop, so users can view and analyze data from various data stores together, as if it were all stored in an Oracle database.

Specify JSON Data

The Data Source node and Create Table nodes are enhanced to allow users to specify the JSON data type in the input data source.

Data Source Node

For this demo, we will focus on the Data Source node. To specify JSON data, create a new workflow with a Data Source node. In the Define Data Source wizard, select the ODMR_SALES_JSON_DATA table. Notice there is only one column (JSON_DATA) in this table, which contains the JSON data.

Click Next to go to the next step, where the JSON_DATA column is shown selected with the JSON(CLOB) data type. The JSON prefix indicates the data stored is in JSON format; CLOB is the original data type. The JSON_DATA column is defined with the new “IS JSON” constraint, which indicates that only valid JSON documents can be stored there. The UI detects this constraint and automatically selects the column as the JSON type. If no “IS JSON” constraint were defined, the column would be shown with a CLOB data type. To manually designate a column as a JSON type, click on the data type itself to bring up an in-place dropdown that lists the original data type (e.g. CLOB) and a corresponding JSON type (e.g. JSON(CLOB)), and select the JSON type. Note: only the following data types can be set to JSON type: VARCHAR2, CLOB, BLOB, RAW, NCLOB, and NVARCHAR2.
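For reference, a column picks up the “IS JSON” constraint when the table is created (or altered) with a check constraint like the one in this minimal sketch; the table and column names here are illustrative only.

CREATE TABLE my_json_data (
  id        NUMBER,
  json_data CLOB CONSTRAINT json_data_is_json CHECK (json_data IS JSON)  -- only valid JSON documents allowed
);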

Click Finish and run the node now.

Once the node is run successfully, open the editor to examine the generated JSON schema.

Notice the message “System Generated Data Guide is available” at the bottom of the Selected Attributes listbox. What happens here is that when the Data Source node is run, it parses the JSON documents to produce a schema that represents the document structure. Here is what the schema looks like:

PATH                              TYPE
$."CUST_ID"                       NUMBER
$."EDUCATION"                     STRING
$."OCCUPATION"                    STRING
$."HOUSEHOLD_SIZE"                STRING
$."YRS_RESIDENCE"                 STRING
$."AFFINITY_CARD"                 STRING
$."BULK_PACK_DISKETTES"           STRING
$."FLAT_PANEL_MONITOR"            STRING
$."HOME_THEATER_PACKAGE"          STRING
$."BOOKKEEPING_APPLICATION"       STRING
$."PRINTER_SUPPLIES"              STRING
$."Y_BOX_GAMES"                   STRING
$."OS_DOC_SET_KANJI"              STRING
$."COMMENTS"                      STRING
$."SALES"                         ARRAY
$."SALES"."PROD_ID"               NUMBER
$."SALES"."QUANTITY_SOLD"         NUMBER
$."SALES"."AMOUNT_SOLD"           NUMBER
$."SALES"."CHANNEL_ID"            NUMBER
$."SALES"."PROMO_ID"              NUMBER

The JSON Path expression syntax and associated data type info (OBJECT, ARRAY, NUMBER, STRING, BOOLEAN, NULL) are used to represent the JSON document structure. We will refer to this JSON schema as the Data Guide throughout the product.
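To see how these path expressions map to values, here is a minimal sketch that extracts two of the scalar paths above with the standard JSON_VALUE function (available in Oracle Database 12.1.0.2); it is independent of Data Miner.

SELECT JSON_VALUE(json_data, '$.CUST_ID' RETURNING NUMBER) AS cust_id,
       JSON_VALUE(json_data, '$.EDUCATION')                AS education
FROM   odmr_sales_json_data;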

Before we look at the Data Guide in the UI, let’s look at the settings that can affect how it is generated. Click the “JSON Settings…” button to open the JSON Parsing Settings dialog.

The settings are described below:

· Generate Data Guide if necessary

o Generate a Data Guide if it is not already generated in parent node.

· Sampling

o Sample JSON documents for Data Guide generation.

· Max. number of documents

o Specify maximum number of JSON documents to be parsed for Data Guide generation.

· Limit Document Values to Process

o Sample JSON document values for Data Guide generation.

· Max. number per document

o Specify maximum number of JSON document scalar values (e.g. NUMBER, STRING, BOOLEAN, NULL) per document to be parsed for Data Guide generation.

The sampling option is enabled by default to prevent long-running parsing of JSON documents; parsing could take a while for a large number of documents. However, users may supply a Data Guide (Import from File) or reuse an existing Data Guide (Import from Workflow) if a compatible Data Guide is available.

Now let’s look at the Data Guide. Go back to the Edit Data Source Node dialog, select the JSON_DATA column, and click the icon to open the Edit Data Guide dialog. The dialog shows the JSON structure in a hierarchical tree view with data type information. The “Number of Values Processed” shows the total number of JSON scalar values that were parsed to produce the Data Guide.

Users can control whether to enable Data Guide generation or import a compatible Data Guide via the menu under the icon.

The menu options are described below:

· Default

o Use the “Generate Data Guide if necessary” setting found in the JSON Parsing Setting dialog (see above).

· On

o Always generate a Data Guide.

· Off

o Do not generate a Data Guide.

· Import From Workflow

o Import a compatible Data Guide from a workflow node (e.g. Data Source, Create Table). The option will be set to Off after the import (disable Data Guide generation).

· Import From File

o Import a compatible Data Guide from a file. The option will be set to Off after the import (disable Data Guide generation).

Users can also export the current Data Guide to a file via the icon.

Select JSON Data

In Data Miner 4.1, a new JSON Query node is added to allow users to selectively bring over desirable JSON attributes as relational format.

JSON Query Node

The JSON Query node is added to the Transforms group of the Workflow.

Let’s create a JSON Query node and connect the Data Source node to it.

Double click the JSON Query node to open the editor. The editor consists of four tabs, which are described as follows:

· JSON

The Column dropdown lists all available columns in the data source where JSON structure (Data Guide) is found. It consists of the following two sub tabs:

o Structure

o Show the JSON structure of the selected column in a hierarchical tree view.

o Data

o Show a sample of the JSON documents found in the selected column. By default it displays the first 2,000 characters (including spaces) of the documents. Users can change the sample size (max. 50,000 chars) and rerun the query to see more of the documents.

· Additional Output

o Allow users to select any non-JSON columns in the data source as additional output columns.

· Aggregation

o Allow users to define aggregations of JSON attributes.

· Preview

o Output Columns

o Show columns in the generated relational output.

o Output Data

o Show data in the generated relational output.

JSON Tab

Let’s select some JSON attributes to bring over. Skip the SALES attributes because we want to define aggregations for these attributes (QUANTITY_SOLD and AMOUNT_SOLD).

To peek at the JSON documents, go to the Data tab. You can change the Sample Size to look at more JSON data. Also, you can search for specific data within the displayed documents by using the search control.

Additional Output Tab

If you have any non-JSON columns in the data source that you want to carry over for output, you can select those columns here.

Aggregate Tab

Let’s define aggregations (using the SUM function) for the QUANTITY_SOLD and AMOUNT_SOLD attributes (within the SALES array) for each customer group (group by CUST_ID).

Click the icon in the top toolbar to open the Edit Group By dialog, where you can select CUST_ID as the Group-By attribute. Notice the Group-By attribute can consist of multiple attributes.

Click OK to return to the Aggregate tab, where you can see the selected CUST_ID Group-By attribute is now added to the Group By Attributes table at the top.

Click the icon in the bottom toolbar to open the Add Aggregations dialog, where you can define the aggregations for both QUANTITY_SOLD and AMOUNT_SOLD attributes using the SUM function.

Next, click the icon in the toolbar to open the Edit Sub Group By dialog, where you can specify a Sub-Group By attribute (PROD_ID) to calculate quantity sold and amount sold per product per customer.

Specifying a Sub-Group By column creates a nested table; the nested table contains columns with data type DM_NESTED_NUMERICALS.

Click OK to return to the Aggregate tab, where you can see the defined aggregations are now added to the Aggregation table at the bottom.
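To give a rough idea of the relational projection and aggregation the node performs, here is a simplified sketch in plain SQL using JSON_TABLE; it is not the node's actual generated code, and it omits the Sub-Group By nesting.

SELECT t.cust_id,
       SUM(t.quantity_sold) AS quantity_sold_sum,
       SUM(t.amount_sold)   AS amount_sold_sum
FROM   odmr_sales_json_data d,
       JSON_TABLE(d.json_data, '$'
         COLUMNS (
           cust_id NUMBER PATH '$.CUST_ID',
           NESTED PATH '$.SALES[*]' COLUMNS (
             quantity_sold NUMBER PATH '$.QUANTITY_SOLD',
             amount_sold   NUMBER PATH '$.AMOUNT_SOLD'
           )
         )) t
GROUP  BY t.cust_id;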

Preview Tab

Let’s go to the Preview tab to look at the generated relational output. The Output Columns tab shows all output columns and their corresponding source JSON attributes. The output columns can be renamed by using the in-place edit control.

The Output Data tab shows the actual data in the generated relational output.

Click OK to close the editor when you are done. The generated relational output is in single-record case format; each row represents a case. If we had not defined the aggregations for the JSON array attributes, the relational output would have been in multiple-record case format. The multiple-record case format is not suitable for building mining models, except for the Association model (which accepts transactional data format with a transaction id and an item id).

Use Case

Here is an example of how the JSON Query node is used to project the JSON data source to relational format, so that the data can be consumed by the Explore Data node for data analysis and the Class Build node for building models.

Conclusion

This blog shows how JSON data can be brought over to Data Miner via the new JSON Query node. Once the data is projected to relational format, it can easily be consumed by Data Miner for graphing, data analysis, text processing, transformation, and modeling.

Thursday Nov 20, 2014

ORACLE BI, DW, ANALYTICS, BIG DATA AND SPATIAL USER COMMUNITY - BIWA Summit'15 www.biwasummit.org

Please share with your Oracle BI, DW, Analytics, Big Data and Spatial user community.  Thanks.  CB

BIWA Summit’15 Jan 27-29, 2015 Early Bird Registration Ends Friday. 

Registration is now LIVE. Register by November 21st (tomorrow) to receive the early bird pricing of $249 and save $50.

Please direct your colleagues to REGISTER NOW and participate to take advantage of the Early Bird registration ($249.00 USD).  EARLY BIRD SPECIAL ENDS TOMORROW (Friday, Nov. 21).  Here’s some information about the event below and some pics and talks from last year to give some feel for the opportunity.   

BIWA Summits have been organized and managed by the Oracle BI, DW and Analytics SIG user community of the IOUG (Independent Oracle User Group) and attract the top Oracle BI, DW, Advanced Analytics and Big Data experts. The 2.5-day BIWA Summit'15 event joins forces with the Oracle Spatial SIG and includes keynotes by industry experts, educational sessions, hands-on labs and networking events. We have a great lineup so far, with Tom Kyte (Senior Technical Architect in Oracle's Server Technology), Doug Cutting (Chief Architect, Cloudera), Oracle BI senior management, Neil Mendelson (VP of Product Management, Big Data and Advanced Analytics), Matt Bradley (SVP, Oracle Product Development, EPM Applications), other featured speakers, and many customers and technical experts (see the web site and search “%” under Sessions). Our BIWA Summit offers a broad, multi-track, user-driven conference that has built up a growing reputation over the years. We emphasize technical content and networking with like-minded customers, users, developers, and product managers (Database, Big Data Appliance, Oracle Advanced Analytics, Spatial, OBIEE, Endeca, Big Data Discovery, In-Memory, SQL Patterns, etc.) who all share an interest in “novel and interesting use cases” of Oracle BI, DW, Advanced Analytics and Spatial technologies, applications and solutions. We’re off to a great start this year with a great agenda and hope to pack the HQ Conference Center this Jan 27-29, 2015 with 300+ attendees.

Please forward and share with your Oracle BI, DW, Analytics, Big Data and Spatial colleagues.   

Thank you!  Hope to see you at BIWA Summit'15

Charlie

Wednesday Oct 08, 2014

2014 was a very good year for Oracle Advanced Analytics at Oracle Open World 2014

2014 was a very good year for Oracle Advanced Analytics at Oracle Open World 2014.   We had a number of customer, partner and Oracle talks that focused on the Oracle Advanced Analytics Database Option.  See below for links to the presentations.  Check back later in the OOW Sessions Content Catalog, as not all presentations have been uploaded yet.  :-(

Big Data and Predictive Analytics: Fiserv Data Mining Case Study [CON8631]

Moving data mining algorithms to run as native data mining SQL functions eliminates data movement, automates knowledge discovery, and accelerates the transformation of large-scale data to actionable insights from days/weeks to minutes/hours. In this session, Fiserv, a leading global provider of electronic commerce systems for the financial services industry, shares best practices for turning in-database predictive models into actionable policies and illustrates the use of Oracle Data Miner for fraud prevention in online payments. Attendees will learn how businesses that implement predictive analytics in their production processes significantly improve profitability and maximize their ROI.

Developing Relevant Dining Visits with Oracle Advanced Analytics at Olive Garden [CON2898]

Olive Garden, traditionally managing its 830 restaurants nationally, transitioned to a localized approach with the help of predictive analytics. Using k-means clustering and logistic classification algorithms, it divided its stores into five behavioral segments. The analysis leveraged Oracle SQL Developer 4.0 and Oracle R Enterprise 1.3 to evaluate 115 million transactions in just 5 percent the time required by the company’s BI tool. While saving both time and money by making it possible to develop the solution internally, this analysis has informed Olive Garden’s latest remodel campaign and continues to uncover millions in profits by optimizing pricing and menu assortment. This session illustrates how Oracle Advanced Analytics solutions directly affect the bottom line.

A Perfect Storm: Oracle Big Data Science for Enterprise R and SAS Users [CON8331]

With the advent of R and a rich ecosystem of users and developers, a myriad of bloggers, and thousands of packages with functionality ranging from social network analysis and spatial data analysis to empirical finance and phylogenetics, use of R is on a steep uptrend. With new R tools from Oracle, including Oracle R Enterprise, Oracle R Distribution, and Oracle R Advanced Analytics for Hadoop, users can scale and integrate R for their enterprise big data needs. Come to this session to learn about Oracle’s R technologies and what data scientists from smart companies around the world are doing with R.

Extending the Power of In-Database Analytics with Oracle Big Data Appliance [CON2452]

The need for speed could not be greater—not speed of processing but time to market. The problem is driven by the long journey data takes before evolving into insight. Insight, however, is always relative to assumption. In fact, analytics is often seen as a battle between assumption and data. Assumptions can be classified into three types: related to distributions, ratios, and relations. In this session, you will see how the most-valuable business insights can come in the matter of hours, not months, when assumptions are challenged with data. This is made possible by the integration of Oracle Big Data Appliance, enabling transparent access to in-database analytics from the data warehouse and avoiding the traditional long journey of data to insight.

Market Basket Analysis at Dunkin’ Brands [CON6545]

With almost 120 years of franchising experience, Dunkin’ Brands owns two of the world’s most recognized, beloved franchises: Dunkin’ Donuts and Baskin-Robbins. This session describes a market basket analysis solution built from scratch on the Oracle Advanced Analytics platform at Dunkin’ Brands. This solution enables Dunkin’ to look at product affinity and a host of associated sales metrics with a view to improving promotional effectiveness and cross-sell/up-sell to increase customer loyalty. The presentation discusses the business value achieved and technical challenges faced in scaling the solution to Dunkin’ Brands’ transaction volumes, including engineered systems (Oracle Exadata) hardware and parallel processing at the core of the implementation.

Predictive Analytics with Oracle Data Mining [CON8596]

This session presents three case studies related to predictive analytics with the Oracle Data Mining feature of Oracle Advanced Analytics. Service contracts cancellation avoidance with Oracle Data Mining is about predicting the contracts at risk of cancellation at least nine months in advance. Predicting hardware opportunities that have a high likelihood of being won means identifying such opportunities at least four months in advance to provide visibility into suppliers of required materials. Finally, predicting cloud customer churn involves identifying the customers that are not as likely to renew subscriptions as others.

SQL Is the Best Development Language for Big Data [CON7439]

SQL has a long and storied history. From the early 1980s till today, data processing has been dominated by this language. It has changed and evolved greatly over time, gaining features such as analytic windowing functions, model clauses, and row-pattern matching. This session explores what's new in SQL and Oracle Database for exploiting big data. You'll see how to use SQL to efficiently and effectively process data that is not stored directly in Oracle Database.

Advanced Predictive Analytics for Database Developers on Oracle [CON7977]

Traditional database applications use SQL queries to filter, aggregate, and summarize data. This is called descriptive analytics. The next level is predictive analytics, where hidden patterns are discovered to answer questions that give unique insights that cannot be derived with descriptive analytics. Businesses are increasingly using machine learning techniques to perform predictive analytics, which helps them better understand past data, predict future trends, and enable better decision-making. This session discusses how to use machine learning algorithms such as regression, classification, and clustering to solve a few selected business use cases.

What Are They Thinking? With Oracle Application Express and Oracle Data Miner [UGF2861]

Have you ever wanted to add some data science to your Oracle Application Express applications? This session shows you how you can combine predictive analytics from Oracle Data Miner into your Oracle Application Express application to monitor sentiment analysis. Using Oracle Data Miner features, you can build data mining models of your data and apply them to your new data. The presentation uses Twitter feeds from conference events to demonstrate how this data can be fed into your Oracle Application Express application and how you can monitor sentiment with the native SQL and PL/SQL functions of Oracle Data Miner. Oracle Application Express comes with several graphical techniques, and the presentation uses them to create a sentiment dashboard.

Transforming Customer Experience with Big Data and Predictive Analytics [CON8148]

Delivering a high-quality customer experience is essential for long-term profitability and customer retention in the communications industry. Although service providers own a wealth of customer data within their systems, the sheer volume and complexity of the data structures inhibit their ability to extract the full value of the information. To change this situation, service providers are increasingly turning to a new generation of business intelligence tools. This session begins by discussing the key market challenges for business analytics and continues by exploring Oracle’s approach to meeting these challenges, including the use of predictive analytics, big data, and social network analytics.

There are a few others where Oracle Advanced Analytics is included (e.g. Retail GBU, Big Data Strategy), but they are typically more broadly focused.  If you search the Content Catalog for “Advanced Analytics” you can find other related presentations that involve OAA.

Hope this helps.  Enjoy!

cb

Sunday Aug 10, 2014

Take a FREE Test Drive of Oracle Data Miner on Amazon Cloud - Offered by Vlamis Software, Oracle Partner

Thanks to wonderfully convenient and easy-to-use Amazon Cloud hosting by Vlamis Software, an Oracle Partner, you can now take a FREE Test Drive of Oracle Data Miner in about 10 minutes!  There are 3 simple steps:

Step 1—Fill out request

  •  Select the Oracle Advanced Analytics Test Drive


Step 2—Connect and Launch

  • Launch the Amazon Cloud instance and wait for the assigned IP address.  Vlamis has provided a nice YouTube instructional video that you should watch for instructions.

  • Connect with Remote Desktop


Step 3—Start Test Drive!

The Amazon Cloud that Vlamis has set up includes everything you'll need to try out Oracle Data Miner:

  • Oracle Database EE  11g Release 2
  • Oracle Advanced Analytics Option
  • SQL Developer 4.0/Oracle Data Miner GUI
  • Demo data for learning - this makes it fast and easy to get started.  The demo data covers multiple scenarios for simple graphing, classification, regression, market basket analysis, anomaly detection, text mining and mining star schema 360 degree customer views
  • Follow the Oracle Data Miner Tutorials that are provided.  These Tutorials are also available on the Oracle Technology Network


  • Try it out! 

Many thanks to Oracle Partner, Vlamis Software for this terrific Oracle Data Miner Test Drive on the Amazon Cloud. 

By the way, if interested, Vlamis is an authorized Instructor for the Oracle University 2 Day Instructor Led Course on Oracle Data Mining and provides data mining consulting and implementation assistance services.

Wednesday Aug 06, 2014

New Book: Predictive Analytics Using Oracle Data Miner


Great New Book Now Available:  Predictive Analytics Using Oracle Data Miner, by Brendan Tierney, Oracle ACE Director

If you have an Oracle Database and want to leverage that data to discover new insights, make predictions and generate actionable insights, this book is a must read for you!  In Predictive Analytics Using Oracle Data Miner: Develop & Use Oracle Data Mining Models in Oracle Data Miner, SQL & PL/SQL, Brendan Tierney, Oracle ACE Director and data mining expert, guides the user through the basic concepts of data mining and offers step by step instructions for solving data-driven problems using SQL Developer’s Oracle Data Mining extension.  Brendan takes it full circle by showing the reader how to deploy advanced analytical methodologies and predictive models immediately into enterprise-wide production environments using the in-database SQL and PL/SQL functionality.  

Definitely a must read for any Oracle data professional!

See Predictive Analytics Using Oracle Data Miner, by Brendan Tierney on Amazon.com  



Sunday May 18, 2014

Oracle Data Miner and Oracle R Enterprise Integration - Watch Demo

Oracle Data Miner and Oracle R Enterprise Integration - Watch Demo

The Oracle Advanced Analytics (Database EE) Option turns the database into an enterprise-wide analytical platform that can quickly deliver enterprise-wide predictive analytics and actionable insights.  Oracle Advanced Analytics comprises the Oracle Data Mining SQL data mining functions, Oracle Data Miner (an extension to SQL Developer that exposes the data mining SQL functions to data analysts), and Oracle R Enterprise, which integrates the R statistical programming language with SQL.  The 15 powerful in-database SQL data mining functions, the SQL Developer/Oracle Data Miner workflow GUI, and the ability to integrate open source R within an analytical methodology make the Oracle Database + Oracle Advanced Analytics Option the ideal platform for building and deploying enterprise-wide predictive analytics applications and solutions.  

In Oracle Data Miner 4.0 we added a new SQL Query node to allow users to insert arbitrary SQL scripts within an ODMr analytical workflow. Additionally, the SQL Query node allows users to leverage registered R scripts to extend Oracle Data Miner's analytical capabilities.  For applications that are mostly OAA/Oracle Data Mining SQL data mining functions based but require additional analytical techniques found in the R community, this is an ideal method for integrating the power of in-database SQL analytical and data mining functions with the flexibility of open source R.  For applications that are built entirely using the R statistical programming language, it may be more practical to stay within the R console or RStudio environments, but for SQL-centric in-database predictive methodologies, this integration is just what might satisfy your needs.
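To give a feel for what such a call can look like, here is a minimal sketch of invoking a registered Oracle R Enterprise script from SQL with the rqTableEval table function. The script name my_r_score, the input columns, and the output signature are assumptions for illustration only, not part of this demo.

SELECT *
FROM   TABLE(rqTableEval(
         CURSOR(SELECT bank_funds, checking_amount FROM insur_cust_ltv_sample),  -- input data passed to the R function
         NULL,                                                                   -- no parameters cursor
         'SELECT 1 AS score FROM DUAL',                                          -- shape of the returned relational output
         'my_r_score'));                                                         -- name of the registered R script (hypothetical)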

Watch this Oracle Data Miner and Oracle R Enterprise Integration YouTube video to see the demo. 

There is an excellent related white paper on this topic, Oracle Data Miner: Integrate Oracle R Enterprise Algorithms into Workflow Using the SQL Query Node (pdf, companion files), that includes examples; it is available on the Oracle Technology Network in the Oracle Data Mining pages.  

Tuesday May 06, 2014

Oracle Data Miner 4.0/SQLDEV 4.0 New Features - Watch Demo!

Oracle Data Miner 4.0 New Features 

Oracle Data Miner/SQLDEV 4.0 (for Oracle Database 11g and 12c)

  • New Graph node (box, scatter, bar, histograms)
  • SQL Query node + integration of R scripts
  • Automatic SQL script generation for deployment

Oracle Advanced Analytics 12c New SQL data mining algorithms/enhancements features exposed in Oracle Data Miner 4.0

  • Expectation Maximization Clustering algorithm
  • PCA & Singular Value Decomposition algorithms
  • Decision Trees can also now mine unstructured data
  • Improved/automated Text Mining, Prediction Details and other algorithm improvements
  • SQL Predictive Queries—automatic build, apply within simple yet powerful SQL query


Sunday Apr 27, 2014

Real Time Association Rules Recommendation Engine

This blog shows how you can write a SQL query for Association Rules recommendations; such a query can be used to recommend products (cross-sell) to a customer based on products already placed in the current shopping cart.  Before we can perform the recommendation, we need to build an association rules model based on previous customer sales transactions. For the demo, I will use the SALES and PRODUCTS tables found in the sample SH schema as input data and build the association model using the free Oracle Data Miner GUI tool.

Association Rules Model Workflow

The SALES table contains time-based (TIME_ID) sales transactions of all customers' (CUST_ID) product purchases (PROD_ID). The actual product names can be found in the PRODUCTS table, so we join these two tables to get the sales transactions with real product names (instead of looking up the product names by PROD_ID later).

Enter the following Transaction ids (CUST_ID, TIME_ID) and item id (PROD_NAME) in the Association Rule Build node editor.

Enter a Maximum Rule Length of 2 and the Minimum Confidence and Support as follows. Lower Confidence and Support percentages yield more rules; higher percentages yield fewer rules. We want each generated rule to have one antecedent and one consequent, so we set the Maximum Rule Length to 2.
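For readers who prefer SQL, the same kinds of settings can be supplied to DBMS_DATA_MINING.CREATE_MODEL directly. The sketch below is only a rough PL/SQL equivalent of what the Build node configures, not the workflow's generated code; SALES_TRANS_V is a hypothetical view joining SH.SALES and SH.PRODUCTS, and the compound (CUST_ID, TIME_ID) case id is simplified to a single column.

-- Settings table holding the association rules parameters
CREATE TABLE ar_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000)
);

BEGIN
  INSERT INTO ar_settings VALUES (dbms_data_mining.algo_name,                dbms_data_mining.algo_apriori_association_rules);
  INSERT INTO ar_settings VALUES (dbms_data_mining.asso_max_rule_length,     '2');
  INSERT INTO ar_settings VALUES (dbms_data_mining.asso_min_confidence,      '0.1');
  INSERT INTO ar_settings VALUES (dbms_data_mining.asso_min_support,         '0.01');
  INSERT INTO ar_settings VALUES (dbms_data_mining.odms_item_id_column_name, 'PROD_NAME');

  dbms_data_mining.create_model(
    model_name          => 'AR_RECOMMENDATION',
    mining_function     => dbms_data_mining.association,
    data_table_name     => 'SALES_TRANS_V',   -- hypothetical view of SH.SALES joined to SH.PRODUCTS
    case_id_column_name => 'CUST_ID',         -- simplified single-column case id
    settings_table_name => 'AR_SETTINGS');
END;
/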

SQL Query for Recommendation

The following SQL query returns the top 3 product recommendations based on the products placed in the customer’s current shopping cart.

SELECT rownum AS rank,
  consequent  AS recommendation
FROM
(
  WITH rules AS (
    SELECT AR.rule_id AS "ID",
      ant_pred.attribute_subname antecedent,
      cons_pred.attribute_subname consequent,
      AR.rule_support support,
      AR.rule_confidence confidence
    FROM TABLE(dbms_data_mining.get_association_rules('AR_RECOMMENDATION')) AR,
      TABLE(AR.antecedent) ant_pred,
      TABLE(AR.consequent) cons_pred
  ),
  cust_data AS (
    SELECT 'Comic Book Heroes' AS prod_name FROM DUAL
    UNION
    SELECT 'Martial Arts Champions' AS prod_name FROM DUAL
  )
  SELECT rules.consequent,
    MAX(rules.confidence) max_confidence,
    MAX(rules.support) max_support
  FROM rules, cust_data
  WHERE cust_data.prod_name = rules.antecedent
  AND rules.consequent NOT IN (SELECT prod_name FROM cust_data)
  GROUP BY rules.consequent
  ORDER BY max_confidence DESC, max_support DESC
)
WHERE rownum <=3;


The above SQL query consists of 3 main sections: association rules, current customer data, and product recommendation.

Association Rules

The first section returns the association rules (antecedent, consequent) and their confidence and support values discovered by the model (AR_RECOMMENDATION) that was built in the above workflow. You may find the DBMS_DATA_MINING.GET_ASSOCIATION_RULES function reference here.

  WITH rules AS (
    SELECT AR.rule_id AS "ID",
      ant_pred.attribute_subname antecedent,
      cons_pred.attribute_subname consequent,
      AR.rule_support support,
      AR.rule_confidence confidence
    FROM TABLE(dbms_data_mining.get_association_rules('AR_RECOMMENDATION')) AR,
      TABLE(AR.antecedent) ant_pred,
      TABLE(AR.consequent) cons_pred

Current Customer Data

The middle section defines the current customer product selection on the fly (real time). For example, we assume this customer placed the 'Comic Book Heroes' and 'Martial Arts Champions' products in the current shopping cart.

  cust_data AS (
    SELECT 'Comic Book Heroes' AS prod_name FROM DUAL
    UNION
    SELECT 'Martial Arts Champions' AS prod_name FROM DUAL
  )

Product Recommendation

Last but not least is the query that returns the recommended products based on the discovered rules and the current customer product selection. It is possible that the rules may suggest the same product (consequent) for different customer products (prod_name), so we aggregate the consequents using the MAX function on the confidence and support values. In case of duplicate recommendations, we just use the maximum confidence and support values for comparison. Moreover, we don’t want to recommend products that are already placed in the customer’s shopping cart, so we add the “NOT IN (SELECT prod_name FROM cust_data)” condition. Finally, the query returns the recommendations in order of highest confidence and support first.

  SELECT rules.consequent,
    MAX(rules.confidence) max_confidence,
    MAX(rules.support) max_support
  FROM rules, cust_data
  WHERE cust_data.prod_name = rules.antecedent
  AND rules.consequent NOT IN (SELECT prod_name FROM cust_data)
  GROUP BY rules.consequent
  ORDER BY max_confidence DESC, max_support DESC

The recommendation query returns the following recommendations for the 'Comic Book Heroes' and 'Martial Arts Champions' products.

RANK   RECOMMENDATION

---------- --------------------------------

         1   Xtend Memory

         2   Endurance Racing

         3   Adventures with Numbers

Alternative SQL Query for Recommendation

The first recommendation query may not be scalable; it returns all possible rules to be processed by the recommendation sub query. A more scalable approach is to push as much processing to the GET_ASSOCIATION_RULES function as possible, so that it returns a minimal set of rules for further processing. Here we specify topn=10, min_confidence=0.1, min_support=0.01, sort_order='RULE_CONFIDENCE DESC', 'RULE_SUPPORT DESC', and the antecedent items to the function, and let it find the top 10 rules that satisfy these criteria. Once we obtain the refined rule set, we filter out recommendations that are already in the customer’s shopping cart and also aggregate (using the MAX() function) the confidence and support values. Finally, we query the top 3 recommendations in order of highest confidence and support first.

SELECT rownum AS rank,
  consequent  AS recommendation
FROM
  (SELECT cons_pred.attribute_subname consequent,
    MAX(AR.rule_support) max_support,
    MAX(AR.rule_confidence) max_confidence
  FROM TABLE (DBMS_DATA_MINING.GET_ASSOCIATION_RULES ( 'AR_RECOMMENDATION', 10, NULL, 0.1, 0.01, 2, 1,
                 ORA_MINING_VARCHAR2_NT ( 'RULE_CONFIDENCE DESC', 'RULE_SUPPORT DESC'),
                 DM_ITEMS(DM_ITEM('PROD_NAME', 'Comic Book Heroes', NULL, NULL),
                          DM_ITEM('PROD_NAME', 'Martial Arts Champions', NULL, NULL)), NULL, 1)) AR, TABLE(AR.consequent) cons_pred
  WHERE cons_pred.attribute_subname NOT IN ('Comic Book Heroes', 'Martial Arts Champions')
  GROUP BY cons_pred.attribute_subname
  ORDER BY max_confidence DESC, max_support DESC
  )
WHERE rownum <=3;


Note: another consideration is to order the rules by the lift value; the higher the lift value, the more accurate the recommendation.
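A minimal variation of the aggregation step that ranks by lift might look like the following sketch; it assumes the rules sub query also selects AR.rule_lift AS lift, as the last example below does.

  SELECT rules.consequent,
    MAX(rules.lift)       max_lift,
    MAX(rules.confidence) max_confidence,
    MAX(rules.support)    max_support
  FROM rules, cust_data
  WHERE cust_data.prod_name = rules.antecedent
  AND rules.consequent NOT IN (SELECT prod_name FROM cust_data)
  GROUP BY rules.consequent
  ORDER BY max_lift DESC, max_confidence DESC, max_support DESC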

SQL Query for Recommendation using Customer Previous Sales Transactions

I am going to extend the above recommendation SQL query to include the customer's previous sales transactions, so that the recommendation is now based on the previously purchased products and the products in the current shopping cart. Moreover, we don’t want any recommended products that have been purchased previously or are already placed in the current shopping cart. For this example, we use a window of 12 months since the customer's last purchase as the past sales history used for recommendation.

To include the customer sales history (assume cust_id = 3), a hist_cust_data sub query is added to obtain the previously purchased products. A tot_cust_data sub query is added to include both the products in the current shopping cart and the previously purchased products. The following query returns the top 3 recommendations based on the customer's previously purchased products in the last 12 months and the products in the current shopping cart.

SELECT rownum AS rank, consequent AS recommendation
FROM
(
  WITH rules AS (
    SELECT AR.rule_id AS "ID",
      ant_pred.attribute_subname antecedent,
      cons_pred.attribute_subname consequent,
      AR.rule_support support,
      AR.rule_confidence confidence,
      AR.rule_lift lift
    FROM TABLE(dbms_data_mining.get_association_rules('AR_RECOMMENDATION')) AR,
         TABLE(AR.antecedent) ant_pred,
         TABLE(AR.consequent) cons_pred
  ),
  cur_cust_data AS (
    SELECT 'Comic Book Heroes' AS PROD_NAME FROM DUAL
    UNION
    SELECT 'Martial Arts Champions' AS PROD_NAME FROM DUAL
  ),
  hist_cust_data AS(
    SELECT
      DISTINCT PROD_NAME
    FROM sh.sales s, sh.products p
    WHERE cust_id = 3
      AND s.prod_id = p.prod_id
      -- customer historical purchase for last 12 months
      AND time_id  >= add_months((SELECT MAX(time_id) FROM sh.sales WHERE cust_id = 3), -12)
  ),
  tot_cust_data AS (
    SELECT PROD_NAME FROM cur_cust_data
    UNION
    SELECT PROD_NAME FROM hist_cust_data
  )
  SELECT rules.consequent,
    SUM(rules.lift) lift_sum,
    SUM(rules.confidence) confidence_sum,
    SUM(rules.support) support_sum
  FROM rules, tot_cust_data
  WHERE tot_cust_data.prod_name = rules.antecedent
    -- don't recommend products that customer already owned or about to purchase
    AND rules.consequent NOT IN (SELECT prod_name FROM tot_cust_data)
  GROUP BY rules.consequent
  ORDER BY lift_sum DESC, confidence_sum DESC, support_sum DESC
)
WHERE rownum <= 3;

Conclusion

This blog shows a few examples of how you can write a recommendation SQL query with different flavors (with or without historical sales transactions). You may also consider assigning a profit to each product, so that you can come up with a query that returns the most profitable product recommendations.

Tuesday Mar 18, 2014

Deploy Data Miner Apply Node SQL as RESTful Web Service for Real-Time Scoring

The free Oracle Data Miner GUI is an extension to Oracle SQL Developer that enables data analysts to work directly with data inside the database, explore the data graphically, build and evaluate multiple data mining models, apply Oracle Data Mining models to new data, and deploy Oracle Data Mining's predictions and insights throughout the enterprise. The product enables complete workflow deployment to a production system via generated PL/SQL scripts (see Generate a PL/SQL script for workflow deployment). This time I want to focus on the model scoring side, especially single-record real-time scoring. Wouldn't it be nice if the scoring function could be accessed by different systems on different platforms? How about deploying the scoring function as a Web Service? This way, any system that can send an HTTP request can invoke the scoring Web Service and consume the returned result as it sees fit. For example, you can have a mobile app that collects customer data and then invokes the scoring Web Service to determine how likely the customer is to buy life insurance. This blog shows a complete demo from building predictive models to deploying a scoring function as a Web Service. However, the demo does not take into account any authentication or security considerations related to Web Services, which are out of the scope of this blog.

Web Services Requirement

This demo uses the Web Services feature provided by the Oracle APEX 4.2 and Oracle REST Data Services 2.0.6 (formerly Oracle APEX Listener). Here are the installation instructions for both products:

For 11g Database

Go to the Oracle Application Express Installation Guide and follow the instructions below:

1.5.1 Scenario 1: Downloading from OTN and Configuring the Oracle Application Express Listener

· Step 1: Install the Oracle Database and Complete Pre-installation Tasks

· Step 2: Download and Install Oracle Application Express

· Step 3: Change the Password for the ADMIN Account

· Step 4: Configure RESTful Services

· Step 5: Restart Processes

· Step 6: Configure APEX_PUBLIC_USER Account

· Step 7: Download and Install Oracle Application Express Listener

· Step 8: Enable Network Services in Oracle Database 11g

· Step 9: Security Considerations

· Step 10: About Developing Oracle Application Express in Other Languages

· Step 11: About Managing JOB_QUEUE_PROCESSES

· Step 12: Create a Workspace and Add Oracle Application Express Users


For 12c Database

Go to the Oracle Application Express Installation Guide (Release 4.2 for Oracle Database 12c) and follow the instructions below:

4.4 Installing from the Database and Configuring the Oracle Application Express Listener

· Install the Oracle Database and Complete Preinstallation Tasks

· Download and Install Oracle Application Express Listener

· Configure RESTful Services

· Enable Network Services in Oracle Database 12c

· Security Considerations

· About Running Oracle Application Express in Other Languages

· About Managing JOB_QUEUE_PROCESSES

· Create a Workspace and Add Oracle Application Express Users


Note: APEX is pre-installed with Oracle Database 12c, but you need to configure it in order to use it.

For this demo, create a Workspace called DATAMINER that is based on an existing user account that has already been granted access to the Data Miner (this blog assumes DMUSER is the Data Miner user account). Please refer to the Oracle By Example Tutorials to review how to create a Data Miner user account and install the Data Miner Repository. In addition, you need to create an APEX user account (for simplicity I use DMUSER).

Build Models to Predict BUY_INSURANCE

This demo uses the demo data set, INSUR_CUST_LTV_SAMPLE, that comes with the Data Miner installation. Now, let’s use the Classification Build node to build some models using the CUSTOMER_ID as the case id and BUY_INSURANCE as the target.

Evaluate the Models

A nice thing about the Build node is that by default it builds a set of models with different algorithms within the same mining function, so we can select the best model to use. Let’s look at the models in the Test Viewer; here we can compare the models by looking at their Predictive Confidence, Overall Accuracy, and Average Accuracy values. Basically, the model with the highest values across these three metrics is the best one to use. As you can see, the winner here is the CLAS_DT_3_6 decision tree model.

Next, let’s see what input data columns are used as predictors for the decision tree model. You can find that information in the Model Viewer below. Surprisingly, it only uses a few columns for the prediction. These columns will be our input data requirement for the scoring function; the rest of the input columns can be ignored.


Score the Model

Let’s complete the workflow with an Apply node, from which we will generate the scoring SQL statement to be used for the Web Service. Here we reuse the INSUR_CUST_LTV_SAMPLE data as input data to the Apply node, and select only the required columns as found in the previous step. Also, in the Class Build node we deselect the other models as output in the Property Inspector (Models tab), so that only the decision tree model will be used for the Apply node. The generated scoring SQL statement will use only the decision tree model to score against the limited set of input columns.

Generate SQL Statement for Scoring

After the workflow is run successfully, we can generate the scoring SQL statement via the “Save SQL” context menu off the Apply node as shown below.

Here is the generated SQL statement:

/* SQL Deployed by Oracle SQL Developer 4.1.0.14.78 from Node "Apply", Workflow "workflow score", Project "project", Connection "conn_12c" on Mar 16, 2014 */
ALTER SESSION set "_optimizer_reuse_cost_annotations"=false;
ALTER SESSION set NLS_NUMERIC_CHARACTERS=".,";
--ALTER SESSION FOR OPTIMIZER
WITH
/* Start of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
"N$10013" as (select /*+ inline */ "INSUR_CUST_LTV_SAMPLE"."BANK_FUNDS",
"INSUR_CUST_LTV_SAMPLE"."CHECKING_AMOUNT",
"INSUR_CUST_LTV_SAMPLE"."CREDIT_BALANCE",
"INSUR_CUST_LTV_SAMPLE"."N_TRANS_ATM",
"INSUR_CUST_LTV_SAMPLE"."T_AMOUNT_AUTOM_PAYMENTS"
from "DMUSER"."INSUR_CUST_LTV_SAMPLE" )
/* End of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
,
/* Start of sql for node: Apply */
"N$10011" as (SELECT /*+ inline */
PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PRED",
PREDICTION_PROBABILITY("DMUSER"."CLAS_DT_3_6", PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) USING *) "CLAS_DT_3_6_PROB",
PREDICTION_COST("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PCST"
FROM "N$10013" )
/* End of sql for node: Apply */
select * from "N$10011";

We need to modify the first SELECT SQL statement to change the data source from a database table to a record that can be constructed on the fly, which is crucial for real-time scoring. Bind variables (e.g. :funds) are used; these variables will be replaced with actual data (passed in by the Web Service request) when the SQL statement is executed.

/* SQL Deployed by Oracle SQL Developer 4.1.0.14.78 from Node "Apply", Workflow "workflow score", Project "project", Connection "conn_12c" on Mar 16, 2014 */
WITH
/* Start of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
"N$10013" as (select /*+ inline */
:funds "BANK_FUNDS",
:checking "CHECKING_AMOUNT",
:credit "CREDIT_BALANCE",
:atm "N_TRANS_ATM",
:payments "T_AMOUNT_AUTOM_PAYMENTS"
from DUAL
)
/* End of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
,
/* Start of sql for node: Apply */
"N$10011" as (SELECT /*+ inline */
PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PRED",
PREDICTION_PROBABILITY("DMUSER"."CLAS_DT_3_6", PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) USING *) "CLAS_DT_3_6_PROB",
PREDICTION_COST("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PCST"
FROM "N$10013" )
/* End of sql for node: Apply */
select * from "N$10011";

Create Scoring Web Service

Assuming Oracle APEX and Oracle REST Data Services have been properly installed and configured, we can proceed to create a RESTful web service for real-time scoring. The following describes the steps to create the Web Service in APEX:

1. APEX Login

You can bring up the APEX login screen by pointing your browser to http://<host>:<port>/ords. Enter your Workspace name and account info to login. The Workspace should be based on the Data Miner DMUSER account for this demo to work.

2. Select SQL Workshop

Select the SQL Workshop icon to proceed.

3. Select RESTful Services

Select the RESTful Services to create the Web Service.

Click the “Create” button to continue.

4. Define Restful Services

Enter the following information to define the scoring Web Service in the RESTful Services Module form:

Name: buyinsurance

URI Prefix: score/

Status: Published

URI Template: buyinsurance?funds={funds}&checking={checking}&credit={credit}&atm={atm}&payments={payments}

Method: GET

Source Type: Query

Format: CSV

Source:

/* SQL Deployed by Oracle SQL Developer 4.1.0.14.78 from Node "Apply", Workflow "workflow score", Project "project", Connection "conn_11204" on Mar 16, 2014 */
WITH
/* Start of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
"N$10013" as (select /*+ inline */
:funds "BANK_FUNDS",
:checking "CHECKING_AMOUNT",
:credit "CREDIT_BALANCE",
:atm "N_TRANS_ATM",
:payments "T_AMOUNT_AUTOM_PAYMENTS"
from DUAL
)
/* End of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
,
/* Start of sql for node: Apply */
"N$10011" as (SELECT /*+ inline */
PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PRED",
PREDICTION_PROBABILITY("DMUSER"."CLAS_DT_3_6", PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) USING *) "CLAS_DT_3_6_PROB",
PREDICTION_COST("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PCST"
FROM "N$10013" )
/* End of sql for node: Apply */
select * from "N$10011";

Note: JSON output format is supported.

Lastly, create the following parameters that are used to pass the data from the Web Service request (URI) to the bind variables used in the scoring SQL statement.
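
Each parameter simply maps a name from the URI Template to the corresponding bind variable in the SQL source (the exact field labels vary slightly between APEX versions). The mapping should look like this:

funds -> :funds
checking -> :checking
credit -> :credit
atm -> :atm
payments -> :payments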

The final RESTful Services Module definition should look like the following. Make sure the “Requires Secure Access” is set to “No” (HTTPS secure request is not addressed in this demo).

Test the Scoring Web Service

Let’s create a simple web page using your favorite HTML editor (I use JDeveloper to create this web page). The page includes a form that is used to collect customer data, and then fires off the Web Service request upon submission to get a prediction and associated probability.

Here is the HTML source of the above Form:

<!DOCTYPE html>
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
  <title>score</title>
</head>
<body>
  <h2>Determine if Customer will Buy Insurance</h2>
  <form action="http://localhost:8080/ords/dataminer/score/buyinsurance" method="get">
    <table>
      <tr>
        <td>Bank Funds:</td>
        <td><input type="text" name="funds"/></td>
      </tr>
      <tr>
        <td>Checking Amount:</td>
        <td><input type="text" name="checking"/></td>
      </tr>
      <tr>
        <td>Credit Balance:</td>
        <td><input type="text" name="credit"/></td>
      </tr>
      <tr>
        <td>Number ATM Transactions:</td>
        <td><input type="text" name="atm"/></td>
      </tr>
      <tr>
        <td>Amount Auto Payments:</td>
        <td><input type="text" name="payments"/></td>
      </tr>
      <tr>
        <td colspan="2" align="right">
          <input type="submit" value="Score"/>
        </td>
      </tr>
    </table>
  </form>
</body>
</html>

When the Score button is pressed, the form sends a GET HTTP request to the web server with the collected form data as name-value parameters encoded in the URL.

http://localhost:8080/ords/dataminer/score/buyinsurance?funds={funds}&checking={checking}&credit={credit}&atm={atm}&payments={payments}

Notice that the {funds}, {checking}, {credit}, {atm}, and {payments} placeholders will be replaced with the actual data from the form. This URI matches the URI Template specified in the RESTful Services Module form above.
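
For example, a form submitted with made-up values would produce a request like the one below; the CSV response contains the CLAS_DT_3_6_PRED, CLAS_DT_3_6_PROB, and CLAS_DT_3_6_PCST columns defined in the scoring SQL (these values are illustrative only):

http://localhost:8080/ords/dataminer/score/buyinsurance?funds=500&checking=1000&credit=100&atm=4&payments=200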

Let’s test the scoring Web Service by entering some values in the form and hitting the Score button to see the prediction.

The prediction along with its probability and cost is returned as shown below. Unfortunately, this customer is less likely to buy insurance.

Let’s change some values and see if we have any luck.

Bingo! This customer is more likely to buy insurance.

Conclusion

This blog shows how to deploy Data Miner generated scoring SQL as a Web Service, which can be consumed by different systems on different platforms from anywhere. In principle, any SQL statement generated from a Data Miner node could be exposed as a Web Service. For example, we could have a Web Service that returns model details, and this information could be consumed by a BI tool for application integration purposes.
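
As a rough sketch of that idea (not part of this demo), the source query of such a service could return the decision tree model details as XML, assuming the DBMS_DATA_MINING.GET_MODEL_DETAILS_XML function is available in this database release:

SELECT DBMS_DATA_MINING.GET_MODEL_DETAILS_XML('CLAS_DT_3_6') AS MODEL_DETAILS_XML FROM DUAL;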

Wednesday Feb 26, 2014

How to generate training and test dataset using SQL Query node in Data Miner

Overview

In Data Miner, the Classification and Regression Build nodes include a process that splits the input dataset into training and test datasets internally, which are then used by the model build and test processes within the nodes. This internal data split saves the user from having to split the data externally and then wire the split datasets into separate build and test processes, as is required in competing products. However, there are times when a user may want to perform an external data split. For example, a user may want to generate a single pair of training and test datasets and reuse them in multiple workflows. Generating training and test datasets can be done easily via the SQL Query node.

Stratified Split

The stratified split is used internally by the Classification Build node because this technique preserves the categorical target distribution in the resulting training and test datasets, which is important for the classification model build. The following SQL statements are essentially what the Classification Build node uses to produce the training and test datasets internally:

SQL statement for Training dataset

SELECT
    v1.*
FROM
(
    -- randomly divide members of the population into subgroups based on target classes
    SELECT a.*,
        row_number() OVER (partition by {target column} ORDER BY ORA_HASH({case id column})) "_partition_caseid"
    FROM {input data} a
) v1,
(
    -- get the count of subgroups based on target classes
    SELECT {target column},
        COUNT(*) "_partition_target_cnt"
    FROM {input data} GROUP BY {target column}
) v2
WHERE v1.{target column} = v2.{target column}
-- random sample subgroups based on target classes in respect to the sample size
AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) <= (v2."_partition_target_cnt" * {percent of training dataset} / 100)


SQL statement for Test dataset

SELECT
    v1.*
FROM
(
    -- randomly divide members of the population into subgroups based on target classes
    SELECT a.*,
        row_number() OVER (partition by {target column} ORDER BY ORA_HASH({case id column})) "_partition_caseid"
    FROM {input data} a
) v1,
(
    -- get the count of subgroups based on target classes
    SELECT {target column},
        COUNT(*) "_partition_target_cnt"
    FROM {input data} GROUP BY {target column}
) v2
WHERE v1.{target column} = v2.{target column}
-- random sample subgroups based on target classes in respect to the sample size
AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) > (v2."_partition_target_cnt" * {percent of training dataset} / 100)

The following describes the placeholders used in the SQL statements:

{target column} - the target column. It must be a categorical type.

{case id column} - the case id column. It must contain unique values that identify the rows.

{input data} - the input dataset.

{percent of training dataset} - the percentage of the input dataset to place in the training dataset. For example, if you want 60% of the input dataset in the training dataset, use the value 60. The test dataset will then contain the remaining 100% - 60% = 40% of the input dataset. The training and test datasets are mutually exclusive.

Random Split

The random split is used internally by the Regression Build node because the target is usually a numerical type. The following SQL statements are essentially what the Regression Build node uses to produce the training and test datasets:

SQL statement for Training dataset

SELECT
    v1.*
FROM
    {input data} v1
WHERE ORA_HASH({case id column}, 99, 0) <= {percent of training dataset}

SQL statement for Test dataset

SELECT
    v1.*
FROM
    {input data} v1
WHERE ORA_HASH({case id column}, 99, 0) > {percent of training dataset}

The following describes the placeholders used in the SQL statements:

{case id column} - the case id column. It must contain unique values that identify the rows.

{input data} - the input dataset.

{percent of training dataset} - the percentage of the input dataset to place in the training dataset. For example, if you want 60% of the input dataset in the training dataset, use the value 60. The test dataset will then contain the remaining 100% - 60% = 40% of the input dataset. The training and test datasets are mutually exclusive.
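
As a concrete illustration, using the demo INSUR_CUST_LTV_SAMPLE data with CUSTOMER_ID as the case id and a 60% training split (inside a workflow, the table reference would instead be the node-generated name, e.g. "INSUR_CUST_LTV_SAMPLE_N$10009"), the two statements would look roughly like this:

-- Training dataset (about 60% of the rows)
SELECT
    v1.*
FROM
    "INSUR_CUST_LTV_SAMPLE" v1
WHERE ORA_HASH("CUSTOMER_ID", 99, 0) <= 60

-- Test dataset (the remaining rows)
SELECT
    v1.*
FROM
    "INSUR_CUST_LTV_SAMPLE" v1
WHERE ORA_HASH("CUSTOMER_ID", 99, 0) > 60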

Use SQL Query node to create training and test dataset

Assume you want to create the training and test datasets from the demo INSUR_CUST_LTV_SAMPLE dataset using the stratified split technique. You can create the following workflow, which uses SQL Query nodes to execute the split SQL statements above and Create Table nodes to persist the resulting datasets.

Assume the case id is CUSTOMER_ID, the target is BUY_INSURANCE, and the training dataset is 60% of the input dataset. You can enter the following SQL statement to create the training dataset in the “SQL Query Stratified Training” SQL Query node:

SELECT
    v1.*
FROM
(
    -- randomly divide members of the population into subgroups based on target classes
    SELECT a.*,
        row_number() OVER (partition by "BUY_INSURANCE" ORDER BY ORA_HASH("CUSTOMER_ID")) "_partition_caseid"
    FROM "INSUR_CUST_LTV_SAMPLE_N$10009" a
) v1,
(
    -- get the count of subgroups based on target classes
    SELECT "BUY_INSURANCE",
        COUNT(*) "_partition_target_cnt"
    FROM "INSUR_CUST_LTV_SAMPLE_N$10009" GROUP BY "BUY_INSURANCE"
) v2
WHERE v1."BUY_INSURANCE" = v2."BUY_INSURANCE"
-- random sample subgroups based on target classes in respect to the sample size
AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) <= (v2."_partition_target_cnt" * 60 / 100)



Likewise, you can enter the following SQL statement to create the test dataset in the “SQL Query Stratified Test” SQL Query node:

SELECT
    v1.*
FROM
(
    -- randomly divide members of the population into subgroups based on target classes
    SELECT a.*,
        row_number() OVER (partition by "BUY_INSURANCE" ORDER BY ORA_HASH("CUSTOMER_ID")) "_partition_caseid"
    FROM "INSUR_CUST_LTV_SAMPLE_N$10009" a
) v1,
(
    -- get the count of subgroups based on target classes
    SELECT "BUY_INSURANCE",
        COUNT(*) "_partition_target_cnt"
    FROM "INSUR_CUST_LTV_SAMPLE_N$10009" GROUP BY "BUY_INSURANCE"
) v2
WHERE v1."BUY_INSURANCE" = v2."BUY_INSURANCE"
-- random sample subgroups based on target classes in respect to the sample size
AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) > (v2."_partition_target_cnt" * 60 / 100)

Now run the workflow to create the training and test datasets. You can find the names of the persisted tables in the associated Create Table nodes.
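
To confirm that the stratified split preserved the target distribution, you can compare the class proportions in the two persisted tables. A minimal sketch ({training table name} is a placeholder; substitute the table name shown in your Create Table node, and repeat for the test table):

SELECT "BUY_INSURANCE",
    COUNT(*) "CNT",
    ROUND(RATIO_TO_REPORT(COUNT(*)) OVER () * 100, 1) "PCT"
FROM {training table name}
GROUP BY "BUY_INSURANCE"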


Conclusion

This blog shows how easy it is to create the training and test datasets using the stratified split SQL statements via the SQL Query nodes. Similarly, you can generate the training and test datasets using the random split technique by substituting the random split SQL statements in the SQL Query nodes in the above workflow. If a large dataset (tens of millions of rows) is used in multiple model build nodes, it may be a good idea to split the data ahead of time to optimize the overall processing time (avoiding multiple internal data splits inside the model build nodes).

Friday Feb 14, 2014

dunnhumby Accelerates Complex Segmentation Queries from Weeks to Minutes—Gains Competitive Advantage

See original story on http://www.oracle.com/us/corporate/customers/customersearch/dunnhumby-1-exadata-ss-2137635.html

dunnhumby Accelerates Complex Segmentation Queries from Weeks to Minutes—Gains Competitive Advantage

dunnhumby is the world’s leading customer-science company. It analyzes customer data and applies insights from more than 400 million customers across the globe to create better customer experiences and build loyalty. With its unique analytical capabilities, dunnhumby helps retailers better serve customers, create a competitive advantage, and enjoy sustained growth.


Challenges

A word from dunnhumby Ltd.

  • “Oracle Exadata Database Machine is helping us to transform our business and improve our competitive edge. We can now complete queries that took weeks in just minutes—driving new product offerings, more competitive bids, and more accurate analyses based on 100% of data instead of just a sampling.” – Chris Wones, Director of Data Solutions, dunnhumby USA

  • Expand breadth of services to maintain a competitive advantage in the customer-science industry
  • Provide clients, including major retail organizations in the United Kingdom and North America, with expanded historical and real-time insight into customer behavior, buying tendencies, and response to promotional campaigns and product offerings
  • Ensure competitive pricing for the company’s customer-analysis services while delivering added value to a growing client base
  • Analyze growing volumes of data rapidly and comprehensively
  • Ensure the security of sensitive information, including protected personal information to reduce risk and support compliance
  • Protect against data loss and reduce the backup and recovery window, as data is crucial to the competitive advantage and success of the business
  • Optimize IT investment and performance across the technology-intensive business
  • Reduce licensing and maintenance costs of previous analytical and data warehouse software

Solutions

  • Deployed Oracle Exadata Database Machine and accelerated queries that previously took two-to-three weeks to just minutes, enabling the company to bid on more complex, custom analyses and gain a competitive advantage
  • Achieved 4x to 30x more data compression using Hybrid Columnar Compression and Oracle Advanced Compression across sets—reducing storage requirements, increasing analysis and backup performance, and optimizing IT investment
  • Consolidated data marts securely with data warehouse schemas in Oracle Exadata, enabling much faster presummarization of large volumes of data
  • Accelerated analytic capabilities to near real time using Oracle Advanced Analytics and third-party tools, enabling analysis of unstructured big data from emerging sources, like smart phones
  • Accelerated segmentation and customer-loyalty analysis from one week to just four hours—enabling the company to deliver more timely information as well as finer-grained analysis
  • Improved analysts’ productivity and focus as they can now run queries and complete analysis without having to wait hours or days for a query to process
  • Generated more accurate business insights and marketing recommendations with the ability to analyze 100% of data—including years of historical data—instead of just a small sample
  • Improved accuracy of marketing recommendations by analyzing larger sample sizes and predicting the market’s reception to new product ideas and strategies
  • Improved secure processing and management of 60 TB of data, growing at a rate of 500 million customer records a week, including information from clients’ customer loyalty programs 
  • Ensured data security and compliance with requirements for safeguarding protected personal information and reduced risk with Oracle Advanced Security, Oracle Directory Services Plus, and Oracle Enterprise Single Sign-On Suite Plus
  • Gained high-performance identity virtualization, storage, and synchronization services that meet the needs of the company’s high-volume environment
  • Ensured performance scalability even with concurrent queries with Oracle Exadata, which demonstrated higher throughput than competing solutions under such conditions
  • Deployed integrated backup and recovery using Oracle’s Sun ZFS Backup Appliance—to support high performance and continuous availability and act as a staging area for both inbound and outbound extract, transform, and load processes

Why Oracle

dunnhumby considered Teradata, IBM Netezza, and other solutions, and selected Oracle Exadata for its ability to sustain high performance and throughput even during concurrent queries. “We needed to see how the system performed when scaled to multiple concurrent queries, and Oracle Exadata’s throughput was much higher than competitive offerings,” said Chris Wones, director of data solutions, dunnhumby, USA.

Implementation Process

dunnhumby began its Oracle Exadata implementation in September 2012 and went live in April 2013. It has installed four Oracle Exadata machine units in the United States and four in the United Kingdom. The company is using three of the four machines in each country as production environments and one machine in each country for development and testing. dunnhumby runs an active-active environment across its Oracle Exadata clusters to ensure high availability.

Monday Feb 03, 2014

How to generate Scatterplot Matrices using R script in Data Miner

Data Miner provides an Explorer node that produces descriptive statistics and histograms, which allow an analyst to examine the input data columns individually. Often an analyst is also interested in the relationships among the data columns, so that he can choose the columns most closely correlated with the target column for model building. To examine relationships among data columns, he can create scatter plots using the Graph node.

For example, an analyst may want to build a regression model that predicts the customer LTV (long term value) using the INSUR_CUST_LTV_SAMPLE demo data. Before building the model, he can create the following workflow with the Graph node to examine the relationships between interested data columns and the LTV target column.

In the Graph node editor, create a scatter plot with a data column of interest (X Axis) against the LTV target column (Y Axis). For the demo, let’s create three scatter plots using these data columns: HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT.

Here are the scatter plots generated by the Graph node. As you can see, HOUSE_OWNERSHIP and N_MORTGAGES are quite positively correlated with the LTV target column, while MORTGAGE_AMOUNT seems less correlated with it.

The problem with the above approach is that it is laborious to create scatter plots one by one, and you cannot examine the relationships among the data columns themselves. To solve this, we can create a scatterplot matrix like the following:

This is a 4 x 4 scatterplot matrix of the columns LTV, HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT. In the top row, you can examine the relationships of HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT against the LTV target column. In the second row, you can examine the relationships of LTV, N_MORTGAGES, and MORTGAGE_AMOUNT against the HOUSE_OWNERSHIP column. In the third and fourth rows, you can examine the relationships of the other columns against N_MORTGAGES and MORTGAGE_AMOUNT, respectively.

To generate this scatterplot matrix, we can invoke the readily available R script RQG$pairs (via the SQL Query node) provided by Oracle R Enterprise. Please refer to http://www.oracle.com/technetwork/database/options/advanced-analytics/r-enterprise/index.html?ssSourceSiteId=ocomen for Oracle R Enterprise installation.

Let’s create the following workflow with the SQL Query node to invoke the R script. Note: a Sample node may be needed to sample down the data size (e.g. to 1000 rows) for a large dataset before it is used for charting.

Enter the following SQL statement in the SQL Query editor. rqTableEval is an Oracle R Enterprise SQL table function that allows the user to invoke an R script from the SQL side. The first cursor in the function specifies the input data (LTV, HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT). The second cursor specifies the optional parameters to the R script, where we define the graph title “Scatterplot Matrices”. The output of the function is an XML document with the graph data embedded in it.

SELECT VALUE FROM TABLE
(
rqTableEval(
cursor(select "INSUR_CUST_LTV_SAMPLE_N$10001"."LTV",
"INSUR_CUST_LTV_SAMPLE_N$10001"."HOUSE_OWNERSHIP",
"INSUR_CUST_LTV_SAMPLE_N$10001"."N_MORTGAGES",
"INSUR_CUST_LTV_SAMPLE_N$10001"."MORTGAGE_AMOUNT"
from "INSUR_CUST_LTV_SAMPLE_N$10001"), -- Input Cursor
cursor(select 'Scatterplot Matrices' as MAIN from DUAL), -- Param Cursor
'XML', -- Output Definition
'RQG$pairs' -- R Script
)
)

You can see what default R scripts are available in the R Scripts tab. This tab is visible only when the Oracle R Enterprise installation is detected.
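
If the shipped RQG$ scripts do not cover what you need, a custom script can be registered in the database and invoked from the SQL Query node in the same way. Here is a minimal sketch, assuming Oracle R Enterprise’s sys.rqScriptCreate procedure and a hypothetical script name RQG$myPairs (registering scripts typically requires the RQADMIN role):

BEGIN
  sys.rqScriptCreate('RQG$myPairs',
    'function(dat, MAIN) {
       # draw a scatterplot matrix of all columns in the input data frame
       pairs(dat, main = MAIN)
       invisible(NULL)
     }');
END;
/

Once registered, the script can be invoked with the same rqTableEval pattern shown above, substituting 'RQG$myPairs' for 'RQG$pairs'.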

Click the button in the toolbar to invoke the R script to produce the Scatterplot matrix below.

You can copy the Scatterplot matrix image to the clipboard or save it to an image file (PNG) for reporting purposes. To do so, right-click on the graph to bring up the pop-up menu below.

The Scatterplot matrix is also available in the Data Viewer of the SQL Query node. To open the Data Viewer, select the “View Data” item in the pop-up menu of the node.

The returned XML data is shown in the Data Viewer below. To view the Scatterplot matrix embedded in the data, click on the XML data to bring up the icon at the far right of the cell, and then click on the icon to bring up the viewer.

Tuesday Jan 14, 2014

How to export data from the Explore Node using Data Miner and SQL Developer

Blog posting by Denny Wong, Principal Member of Technical Staff, User Interfaces and Components, Oracle Data Mining Development

The Explorer node generates descriptive statistics and histogram data for all input table columns.  These data may help the user analyze the input data and determine whether any action (e.g. transformation) is needed before using it for data mining.  An analyst may want to export this data to a file for offline analysis (e.g. in Excel) or reporting.  The Explorer node writes this data to a database table specified in the Output tab of the Property Inspector.  In this case, the data is generated to a table named “OUTPUT_1_2”.


To export the table to a file, we can use the SQL Developer Export wizard. Go to the Connections tab in the Navigator Window, search for the table “OUTPUT_1_2” within the proper connection, then bring up the pop-up menu off the table. Click on the Export menu to launch the Export Wizard.


In the wizard, uncheck “Export DDL” and select the “Export Data” option, since we are only interested in the data itself. For the Format option, select “excel” in this example (about a dozen output formats are supported) and specify the output file name. When the wizard finishes, an Excel file is generated.


Let’s open the file to examine what is in it. As expected, it contains all statistical data for all input columns. The histogram data is listed as the last column (HISTOGRAMS), and it has this ODMRSYS.ODMR_HISTOGRAMS structure.


For example, let’s take a closer look at the histogram data for the BUY_INSURANCE column:

ODMRSYS.ODMR_HISTOGRAMS(ODMRSYS.ODMR_HISTOGRAM_POINT('"BUY_INSURANCE"',''No'',NULL,NULL,73.1),ODMRSYS.ODMR_HISTOGRAM_POINT('"BUY_INSURANCE"',''Yes'',NULL,NULL,26.9))

This column contains an ODMRSYS.ODMR_HISTOGRAMS object, which is an array of ODMRSYS.ODMR_HISTOGRAM_POINT structures. We can describe the structure to see what is in it.
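
For example, from SQL Developer or SQL*Plus (the exact attribute data types may vary by release):

DESCRIBE ODMRSYS.ODMR_HISTOGRAMS
DESCRIBE ODMRSYS.ODMR_HISTOGRAM_POINT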


The ODMRSYS.ODMR_HISTOGRAM_POINT contains five attributes, which represent the histogram data. The ATTRIBUTE_NAME contains the attribute name (e.g. BUY_INSURANCE), the ATTRIBUTE_VALUE contains the attribute values (e.g. No, Yes), the GROUPING_ATTRIBUTE_NAME and GROUPING_ATTRIBUTE_VALUE are not used (these fields are populated only when the Group By option is specified), and the ATTRIBUTE_PERCENT contains the percentages (e.g. 73.1, 26.9) for the respective attribute values.


As you can see, the complex ODMRSYS.ODMR_HISTOGRAMS output format may be difficult to read, and it may require some processing before the data can be used. Alternatively, we can “unnest” the histogram data into transactional format before exporting it. This way we don’t have to deal with the complex array structure, and the data is easier to consume. To do that, we write a simple SQL query to “unnest” the data and use the new SQL Query node (Extract histogram data) to run the query (see below). We then use a Create Table node (Explorer output table) to persist the “unnested” histogram data along with the statistical data.

1. Create a SQL Query node

Create a SQL Query node and connect the “Explore Data” node to it. You may rename the SQL Query node to “Extract histogram data” to make it clear it is used to “unnest” the histogram data.

2. Specify a SQL query to “unnest” histogram data

Double-click the “Extract histogram data” node to bring up the editor, and enter the following SELECT statement:

SELECT
    "Explore Data_N$10002"."ATTR",
    "Explore Data_N$10002"."AVG",
    "Explore Data_N$10002"."DATA_TYPE",
    "Explore Data_N$10002"."DISTINCT_CNT",
    "Explore Data_N$10002"."DISTINCT_PERCENT",
    "Explore Data_N$10002"."MAX",
    "Explore Data_N$10002"."MEDIAN_VAL",
    "Explore Data_N$10002"."MIN",
    "Explore Data_N$10002"."MODE_VALUE",
    "Explore Data_N$10002"."NULL_PERCENT",
    "Explore Data_N$10002"."STD",
    "Explore Data_N$10002"."VAR",
    h.ATTRIBUTE_VALUE,
    h.ATTRIBUTE_PERCENT
FROM
    "Explore Data_N$10002", TABLE("Explore Data_N$10002"."HISTOGRAMS") h

Click OK to close the editor. This query is used to extract out the ATTRIBUTE_VALUE and ATTRIBUTE_PERCENT fields from the ODMRSYS.ODMR_HISTOGRAMS nested object.

Note: you may select only the columns that contain the statistics you are interested in.  "Explore Data_N$10002" is a generated unique name that references the Explorer node; you may see a slightly different name ending with some other unique number.

The query produces the following output.  The last two columns are the histogram data in transactional format.
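
For the BUY_INSURANCE column, for example, the unnested rows would look roughly like this (statistics columns omitted; percentages taken from the histogram shown earlier):

ATTR            ATTRIBUTE_VALUE   ATTRIBUTE_PERCENT
BUY_INSURANCE   No                73.1
BUY_INSURANCE   Yes               26.9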

3. Create a Create Table node to persist the “unnested” histogram data

Create a Create Table node and connect the “Extract histogram data” node to it. You may rename the Create Table node to “Explorer output table” to make it clear it is used to persist the “unnested” histogram data.


4. Export “unnested” histogram data to Excel file

Run the “Explorer output table” node to persist the “unnested” histogram data to a table. The name of the output table (OUTPUT_3_4) can be found in the Property Inspector below.


Next, we can use the SQL Developer Export wizard as described above to export the table to an Excel file. As you can see, the histogram data is now in transactional format; it is more readable and can readily be consumed.


Tuesday Dec 31, 2013

Oracle BIWA Summit 2014 January 14-16, 2014 at Oracle HQ in Redwood Shores, CA


Oracle Business Intelligence, Warehousing & Analytics Summit - Redwood City

Oracle is a proud sponsor of the Business Intelligence, Warehousing & Analytics (BIWA) Summit happening January 14 – 16 at the Oracle Conference Center in Redwood City. The Oracle BIWA Summit brings together Oracle ACE experts, customers who are currently using or planning to use Oracle BI, Warehousing and Analytics products and technologies, partners and Oracle Product Managers, Support Personnel and Development Managers. Join us on Tuesday, January 14 at 5 p.m. to hear featured speaker Balaji Yelamanchili, Senior Vice President Analytics and Performance Management Products, for his keynote: Oracle Business Intelligence -- Innovate Faster. Visit the BIWA site http://www.biwasummit.com/ for more information today.

Among the approximately 50 technical presentations, featured talks, and Hands-on Labs, I'll be delivering a presentation on Oracle Advanced Analytics and a Hands-on Lab on using the OAA/Oracle Data Miner GUI.

 AA-1010 BEST PRACTICES FOR IN-DATABASE ANALYTICS

Session ID: AA-1010

Presenter: Charlie Berger, Oracle

Abstract:

In the era of Big Data, enterprises are acquiring increasing volumes and varieties of data from a rapidly growing range of internet, mobile, sensor and other real-time and near real-time sources.  The driving force behind this trend toward Big Data analysis is the ability to use this data for “actionable intelligence” -- to predict patterns and behaviors and to deliver essential information when and where it is needed. Oracle Database uniquely offers a powerful platform to perform this predictive analytics and location analysis with in-database data mining, statistical processing and SQL Analytics.  Oracle Advanced Analytics embeds powerful data mining algorithms and adds enterprise scale open source R to solve problems such as predicting customer behavior, anticipating churn, detecting fraud, market basket analysis and discovering customer segments.  Oracle Data Miner GUI , a new SQL Developer 4.0 Extension, enables business analysts to quickly analyze data and visualize data, build, evaluate and apply predictive models and deploy via SQL scripts sophisticated predictive analytics methodologies—all while keeping the data inside the Oracle Database.  Come learn best practices and customer examples for exploiting Oracle’s scalable, performant and secure in-database analytics capabilities to extract more value and actionable intelligence from your data.

HOL-AA-1008 LEARN TO USE ORACLE ADVANCED ANALYTICS FOR PREDICTIVE ANALYTICS SOLUTIONS

Session ID: HOL-AA-1008

Presenter: Charles Berger, Oracle & Karl Rexer, Rexer Analytics

Abstract:

Big Data; Bigger Insights!  Oracle Data Mining Release 12c, a component of the Oracle Advanced Analytics database Option, embeds powerful data mining algorithms in the SQL kernel of the Oracle Database for problems such as predicting customer behavior, anticipating churn, identifying up-sell and cross-sell, detecting anomalies and potential fraud, market basket analysis, customer profiling, and text mining.  Oracle Data Miner GUI, a new SQL Developer 4.0 Extension, enables business analysts to quickly analyze and visualize data, build, evaluate, and apply predictive models, and develop sophisticated predictive analytics methodologies, all while keeping the data inside Oracle Database.  Come see how easily you can discover big insights from your Oracle data, generate SQL scripts for deployment and automation, and deploy results into Oracle Business Intelligence (OBIEE) dashboards.


Monday Dec 09, 2013

Come See and Test Drive Oracle Advanced Analytics at the BIWA Summit'14, Jan 14-16, 2014

The BIWA Summit '14 January 14-16 at Oracle HQ Conference Center Detailed Agenda is now published.   

Please share with your others by Tweeting, Blogging, Facebook, LinkedIn, Email, etc.!

The BIWA Summit is known for novel and interesting use cases of Oracle Big Data, Exadata, Advanced Analytics/Data Mining, OBIEE, Spatial, Endeca and more!  There are opportunities to get hands-on experience with products in the Hands-on Labs, great customer case studies, and talks by Oracle technical professionals and partners.  Meet with technical experts.  Click HERE to read detailed abstracts and speaker profiles.

Use the SPECIAL DISCOUNT code ORACLE12C and registration is only $199 for the 2.5 day technically focused Oracle user group event.

Charlie  (Oracle Employee Advisor to Oracle BIWA Special Interest User Group)

----



Tuesday Nov 12, 2013

Oracle Big Data Learning Library

Click on LEARN BY PRODUCT to view all learning resources.

Oracle Big Data Essentials

Attend this Oracle University Course!

Using Oracle NoSQL Database

Attend this Oracle University class!

Oracle and Big Data on OTN

See the latest resource on OTN.


Wednesday Sep 04, 2013

Oracle Data Miner (Extension of SQL Developer 4.0) Integrate Oracle R Enterprise Mining Algorithms into workflow using the SQL Query node

I posted a new white paper authored by Denny Wong, Principal Member of Technical Staff, User Interfaces and Components, Oracle Data Mining Technologies.  You can access the white paper here and the companion files here.  Here is an excerpt:

Oracle Data Miner (Extension of SQL Developer 4.0) 

Integrate Oracle R Enterprise Mining Algorithms into workflow using the SQL Query node

Oracle R Enterprise (ORE), a component of the Oracle Advanced Analytics Option, makes the open source R statistical programming language and environment ready for the enterprise and big data. Designed for problems involving large amounts of data, Oracle R Enterprise integrates R with the Oracle Database. R users can develop, refine and deploy R scripts that leverage the parallelism and scalability of the database to perform predictive analytics and data analysis.

Oracle Data Miner (ODMr) offers a comprehensive set of in-database algorithms for performing a variety of mining tasks, such as classification, regression, anomaly detection, feature extraction, clustering, and market basket analysis. One of the important capabilities of the new SQL Query node in Data Miner 4.0 is a simplified interface for integrating R scripts registered with the database. This provides the support necessary for R Developers to provide useful mining scripts for use by data analysts. This synergy provides many additional benefits as noted below.

· R developers can further extend ODMr mining capabilities by incorporating the extensive R mining algorithms from the open source CRAN packages or leveraging any user developed custom R algorithms via SQL interfaces provided by ORE.

· Since this SQL Query node can be part of a workflow process, R scripts can leverage functionalities provided by other workflow nodes which can simplify the overall effort of integrating R capabilities within the database.

· R mining capabilities can be included in the workflow deployment scripts produced by the new SQL script generation feature, so R functionality can easily be deployed within the context of a Data Miner workflow.

· Data and processing are secured and controlled by the Oracle Database. This removes much of the risk incurred with other solutions, where users have to export data out of the database in order to perform advanced analytics.

Oracle Advanced Analytics saves analysts, developers, database administrators and management the headache of trying to integrate R and database analytics. Instead, users can quickly gain the benefit of new R analytics and spend their time and effort on developing business solutions instead of building homegrown analytical platforms.

This paper should be very useful to R developers wishing to better understand how to embed R scripts for use by data analysts.  Analysts will also find the paper useful to see how R features can be surfaced for their use in Data Miner. The specific use case covered demonstrates how to use the SQL Query node to integrate R glm and rpart regression model build, test, and score operations into the workflow, along with nodes that perform data preparation and residual plot graphing. However, the integration process described here can easily be adapted to integrate other R operations, such as statistical data analysis and advanced graphing, to expand ODMr functionality.
