Wednesday Mar 09, 2016

Learn Predictive Analytics in 2 Days - New Oracle University Course!

What you will learn:  This Predictive Analytics using Oracle Data Mining Ed 1 training will review the basic concepts of data mining. Expert Oracle University instructors will teach you how to leverage the predictive analytical power of Oracle Data Mining, a component of the Oracle Advanced Analytics option.

Learn To:

  • Explain basic data mining concepts and describe the benefits of predictive analysis.
  • Understand primary data mining tasks, and describe the key steps of a data mining process.
  • Use the Oracle Data Miner to build, evaluate, apply, and deploy multiple data mining models.
  • Use Oracle Data Mining's predictions and insights to address many kinds of business problems.
  • Deploy data mining models for end-user access, in batch or real-time, and within applications.

Benefits to You

When you've completed this course, you'll be able to use the Oracle Data Miner 4.1, the Oracle Data Mining "workflow" GUI, which enables data analysts to work directly with data inside the database. The Data Miner GUI provides intuitive tools that help you explore the data graphically, build and evaluate multiple data mining models, apply Oracle Data Mining models to new data, and deploy Oracle Data Mining's predictions and insights throughout the enterprise.

Oracle Data Miner's SQL APIs - Get Results in Real-Time

Oracle Data Miner's SQL APIs automatically mine Oracle data and deploy results in real-time. Because the data, models, and results remain in the Oracle Database, data movement is eliminated, security is maximized and information latency is minimized.

Introduction

  • Course Objectives
  • Suggested Course Prerequisites
  • Suggested Course Schedule
  • Class Sample Schemas
  • Practice and Solutions Structure
  • Review location of additional resources

Predictive Analytics and Data Mining Concepts

  • What is the Predictive Analytics?
  • Introducting the Oracle Advanced Analytics (OAA) Option?
  • What is Data Mining?
  • Why use Data Mining?
  • Examples of Data Mining Applications
  • Supervised Versus Unsupervised Learning
  • Supported Data Mining Algorithms and Uses

Understanding the Data Mining Process

  • Common Tasks in the Data Mining Process
  • Introducing the SQL Developer interface

Introducing Oracle Data Miner 4.1

  • Data mining with Oracle Database
  • Setting up Oracle Data Miner
  • Accessing the Data Miner GUI
  • Identifying Data Miner interface components
  • Examining Data Miner Nodes
  • Previewing Data Miner Workflows

Using Classification Models

  • Reviewing Classification Models
  • Adding a Data Source to the Workflow
  • Using the Data Source Wizard
  • Using Explore and Graph Nodes
  • Using the Column Filter Node
  • Creating Classification Models
  • Building the Models
  • Examining Class Build Tabs

Using Regression Models

  • Reviewing Regression Models
  • Adding a Data Source to the Workflow
  • Using the Data Source Wizard
  • Performing Data Transformations
  • Creating Regression Models
  • Building the Models
  • Comparing the Models
  • Selecting a Model

Using Clustering Models

  • Describing Algorithms used for Clustering Models
  • Adding Data Sources to the Workflow
  • Exploring Data for Patterns
  • Defining and Building Clustering Models
  • Comparing Model Results
  • Selecting and Applying a Model
  • Defining Output Format
  • Examining Cluster Results

Performing Market Basket Analysis

  • What is Market Basket Analysis?
  • Reviewing Association Rules
  • Creating a New Workflow
  • Adding a Data Source to the Workflow
  • Creating an Association Rules Model
  • Defining Association Rules
  • Building the Model
  • Examining Test Results

Performing Anomaly Detection

  • Reviewing the Model and Algorithm used for Anomaly Detection
  • Adding Data Sources to the Workflow
  • Creating the Model
  • Building the Model
  • Examining Test Results
  • Applying the Model
  • Evaluating Results

Mining Structured and Unstructured Data

  • Dealing with Transactional Data
  • Handling Aggregated (Nested) Data
  • Joining and Filtering data
  • Enabling mining of Text
  • Examining Predictive Results

Using Predictive Queries

  • What are Predictive Queries?
  • Creating Predictive Queries
  • Examining Predictive Results

Deploying Predictive models

  • Requirements for deployment
  • Deployment Options
  • Examining Deployment Options

Friday Aug 07, 2015

Oracle Advanced Analytics Oracle University (OU) Classes in Cambridge, MA. September 28-Oct. 1, 2015

Oracle University has rescheduled their 2 day back to back Oracle Advanced Analytics OU Classes in Cambridge, MA.   Please help spread the word. 

Oracle Advanced Analytics combo-course (ODM + ORE) training

This is great opportunity for big data analytics customers and partners to learn hands on about using Oracle Advanced Analytics.  Vlamis, authorized OU instructor(s), will be teaching the OAA/ODM & OAA/ORE courses again and have been a great and knowledgeable OAA training and implementation partner. The courses are also during the week of Predictive Analytics World in Boston (Oracle will be exhibiting and speaking) so perhaps a good time for customers to come to Boston, perhaps use some OU credits, learn some new skills and focus on Oracle’s predictive analytics. 

Anyone (customers and Oracle Employees) can register through us at http://www.vlamis.com/training/ or via their normal OU connections. They should be able to utilize OU training credits for either course.  Oracle Employees should register through the Employee Self Service from Self Service Applications

Please forward to any appropriate Oracle Advanced Analytics customers and partners.  Thanks!

Charlie 

Wednesday Feb 26, 2014

How to generate training and test dataset using SQL Query node in Data Miner

Overview

In Data Miner, the Classification and Regression Build nodes include a process that splits the input dataset into training and test dataset internally, which are then used by the model build and test processes within the nodes. This internal data split feature alleviates user from performing external data split, and then tie the split dataset into a build and test process separately as found in other competitive products. However, there are times user may want to perform an external data split. For example, user may want to generate a single training and test dataset, and reuse them in multiple workflows. The generation of training and test dataset can be done easily via the SQL Query node.

Stratified Split

The stratified split is used internally by the Classification Build node, because this technique can preserve the categorical target distribution in the resulting training and test dataset, which is important for the classification model build. The following shows the SQL statements that are essentially used by the Classification Build node to produce the training and test dataset internally:

SQL statement for Training dataset

SELECT

v1.*

FROM

(

-- randomly divide members of the population into subgroups based on target classes

SELECT a.*,

row_number() OVER (partition by {target column} ORDER BY ORA_HASH({case id column})) "_partition_caseid"

FROM {input data} a

) v1,

(

-- get the count of subgroups based on target classes

SELECT {target column},

COUNT(*) "_partition_target_cnt"

FROM {input data} GROUP BY {target column}

) v2

WHERE v1. {target column} = v2. {target column}

-- random sample subgroups based on target classes in respect to the sample size

AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) <= (v2."_partition_target_cnt" * {percent of training dataset} / 100)


SQL statement for Test dataset

SELECT

v1.*

FROM

(

-- randomly divide members of the population into subgroups based on target classes

SELECT a.*,

row_number() OVER (partition by {target column} ORDER BY ORA_HASH({case id column})) "_partition_caseid"

FROM {input data} a

) v1,

(

-- get the count of subgroups based on target classes

SELECT {target column},

COUNT(*) "_partition_target_cnt"

FROM {input data} GROUP BY {target column}

) v2

WHERE v1. {target column} = v2. {target column}

-- random sample subgroups based on target classes in respect to the sample size

AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) > (v2."_partition_target_cnt" * {percent of training dataset} / 100)

The followings describe the placeholders used in the SQL statements:

{target column} - target column. It must be categorical type.

{case id column} - case id column. It must contain unique numbers that identify the rows.

{input data} - input data set.

{percent of training dataset} - percent of training dataset. For example, if you want to split 60% of input dataset into training dataset, use the value 60. The test dataset will contain 100%-60% = 40% of the input dataset. The training and test dataset are mutually exclusive.

Random Split

The random split is used internally by the Regression Build node because the target is usually numerical type. The following shows the SQL statements that are essentially used by the Regression Build node to produce the training and test dataset:

SQL statement for Training dataset

SELECT

v1.*

FROM

{input data} v1

WHERE ORA_HASH({case id column}, 99, 0) <= {percent of training dataset}

SQL statement for Test dataset

SELECT

    v1.*

FROM

{input data} v1

WHERE ORA_HASH({case id column}, 99, 0) > {percent of training dataset}

The followings describe the placeholders used in the SQL statements:

{case id column} - case id column. It must contain unique numbers that identify the rows.

{input data} - input data set.

{percent of training dataset} - percent of training dataset. For example, if you want to split 60% of input dataset into training dataset, use the value 60. The test dataset will contain 100%-60% = 40% of the input dataset. The training and test dataset are mutually exclusive.

Use SQL Query node to create training and test dataset

Assume you want to create the training and test dataset out of the demo INSUR_CUST_LTV_SAMPLE dataset using the stratified split technique; you can create the following workflow to utilize the SQL Query nodes to execute the above split SQL statements to generate the dataset, and then use the Create Table nodes to persist the resulting dataset.

Assume the case id is CUSTOMER_ID, target is BUY_INSURANCE, and the training dataset is 60% of the input dataset. You can enter the following SQL statement to create the training dataset in the “SQL Query Stratified Training” SQL Query node:

SELECT

v1.*

FROM

(

-- randomly divide members of the population into subgroups based on target classes

SELECT a.*,

row_number() OVER (partition by "BUY_INSURANCE" ORDER BY ORA_HASH("CUSTOMER_ID")) "_partition_caseid"

FROM "INSUR_CUST_LTV_SAMPLE_N$10009" a

) v1,

(

-- get the count of subgroups based on target classes

SELECT "BUY_INSURANCE",

COUNT(*) "_partition_target_cnt"

FROM "INSUR_CUST_LTV_SAMPLE_N$10009" GROUP BY "BUY_INSURANCE"

) v2

WHERE v1."BUY_INSURANCE" = v2."BUY_INSURANCE"

-- random sample subgroups based on target classes in respect to the sample size

AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) <= (v2."_partition_target_cnt" * 60 / 100)



Likewise, you can enter the following SQL statement to create the test dataset in the “SQL Query Stratified Test” SQL Query node:

SELECT

v1.*

FROM

(

-- randomly divide members of the population into subgroups based on target classes

SELECT a.*,

row_number() OVER (partition by "BUY_INSURANCE" ORDER BY ORA_HASH("CUSTOMER_ID")) "_partition_caseid"

FROM "INSUR_CUST_LTV_SAMPLE_N$10009" a

) v1,

(

-- get the count of subgroups based on target classes

SELECT "BUY_INSURANCE",

COUNT(*) "_partition_target_cnt"

FROM "INSUR_CUST_LTV_SAMPLE_N$10009" GROUP BY "BUY_INSURANCE"

) v2

WHERE v1."BUY_INSURANCE" = v2."BUY_INSURANCE"

-- random sample subgroups based on target classes in respect to the sample size

AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) > (v2."_partition_target_cnt" * 60 / 100)

Now run the workflow to create the training and test dataset. You can find the table names of the persisted dataset in the associated Create Table nodes.


Conclusion

This blog shows how easily to create the training and test dataset using the stratified split SQL statements via the SQL Query nodes. Similarly, you can generate the training and test dataset using the random split technique by replacing SQL statements with the random split SQL statements in the SQL Query nodes in the above workflow. If a large dataset (tens of millions of rows) is used in multiple model build nodes, it may be a good idea to split the data ahead of time to optimize the overall processing time (avoid multiple internal data splits inside the model build nodes).

Thursday May 10, 2012

Oracle Data Mining Virtual Classes Scheduled

Two Oracle Data Mining Virtual Classes are now scheduled.  Register for a course in 2 easy steps.

Step 1: Select your Live Virtual Class options

Select


Live Virtual Class
Course ID: D76362GC10
Course Title: Oracle Database 11g: Data Mining Techniques
NEW
Duration: 2 Days
Price: US$ 1,300 Dollars

Step 2: Select the date and location of your Live Virtual Class

Please select a location below then click on the Add to Cart button

i

Location  Duration Class Date Class Start Time Class End Time Course Materials Instruction Language Seats Audience Employees
Online 2 Days 09-Aug-2012 04:00 AM EDT 12:00 PM EDT English English Available Public Employees
Online 2 Days 18-Oct-2012 04:00 AM EDT 12:00 PM EDT English English Available Public Employees

100% Student Satisfaction: Oracle's 100% Student Satisfaction program applies to those publicly scheduled and publicly available Oracle University Instructor Led Training classes that are identified as part of the 100% Student Satisfaction program on the http://www.oracle.com/education website at the time the class is purchased. Oracle will permit unsatisfied students to retake the class, subject to terms and conditions. Customers are not entitled to a refund. For more information and additional terms, conditions and restrictions that apply, click here

Friday Mar 23, 2012

NEW 2-Day Instructor Led Course on Oracle Data Mining Now Available!

UPDATED - See the updated and new Learn Predictive Analytics using Oracle Data Mining 2-day Oracle University Course.

A NEW 2-Day Instructor Led Course on Oracle Data Mining has been developed for customers and anyone wanting to learn more about data mining, predictive analytics and knowledge discovery inside the Oracle Database.  To register interest in attending the class, click here and submit your preferred format.

Course Objectives:

  • Explain basic data mining concepts and describe the benefits of predictive analysis
  • Understand primary data mining tasks, and describe the key steps of a data mining process
  • Use the Oracle Data Miner to build,evaluate, and apply multiple data mining models
  • Use Oracle Data Mining's predictions and insights to address many kinds of business problems, including: Predict individual behavior, Predict values, Find co-occurring events
  • Learn how to deploy data mining results for real-time access by end-users

Five reasons why you should attend this 2 day Oracle Data Mining Oracle University course. With Oracle Data Mining, a component of the Oracle Advanced Analytics Option, you will learn to gain insight and foresight to:

  • Go beyond simple BI and dashboards about the past. This course will teach you about "data mining" and "predictive analytics", analytical techniques that can provide huge competitive advantage
  • Take advantage of your data and investment in Oracle technology
  • Leverage all the data in your data warehouse, customer data, service data, sales data, customer comments and other unstructured data, point of sale (POS) data, to build and deploy predictive models throughout the enterprise.
  • Learn how to explore and understand your data and find patterns and relationships that were previously hidden
  • Focus on solving strategic challenges to the business, for example, targeting "best customers" with the right offer, identifying product bundles, detecting anomalies and potential fraud, finding natural customer segments and gaining customer insight.

About

Everything about Oracle Data Mining, a component of the Oracle Advanced Analytics Option - News, Technical Information, Opinions, Tips & Tricks. All in One Place

Search

Categories
Archives
« July 2016
SunMonTueWedThuFriSat
     
1
2
3
4
5
6
7
8
9
10
11
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
      
Today