Tuesday Mar 18, 2014

Deploy Data Miner Apply Node SQL as RESTful Web Service for Real-Time Scoring

The free Oracle Data Miner GUI is an extension to Oracle SQL Developer that enables data analysts to work directly with data inside the database, explore the data graphically, build and evaluate multiple data mining models, apply Oracle Data Mining models to new data, and deploy Oracle Data Mining's predictions and insights throughout the enterprise. The product enables complete workflow deployment to a production system via generated PL/SQL scripts (see Generate a PL/SQL script for workflow deployment). This time I want to focus on the model scoring side, especially single-record real-time scoring. Wouldn't it be nice if the scoring function could be accessed by different systems on different platforms? How about deploying the scoring function as a Web Service? This way, any system that can send an HTTP request can invoke the scoring Web Service and consume the returned result as it sees fit. For example, you could have a mobile app that collects customer data and then invokes the scoring Web Service to determine how likely the customer is to buy life insurance. This blog shows a complete demo, from building predictive models to deploying a scoring function as a Web Service. However, the demo does not take into account any authentication or security considerations related to Web Services, which are outside the scope of this blog.

Web Services Requirement

This demo uses the Web Services feature provided by Oracle APEX 4.2 and Oracle REST Data Services 2.0.6 (formerly Oracle APEX Listener). Here are the installation instructions for both products:

For 11g Database

Go to the Oracle Application Express Installation Guide and follow the instructions below:

1.5.1 Scenario 1: Downloading from OTN and Configuring the Oracle Application Express Listener

· Step 1: Install the Oracle Database and Complete Pre-installation Tasks

· Step 2: Download and Install Oracle Application Express

· Step 3: Change the Password for the ADMIN Account

· Step 4: Configure RESTful Services

· Step 5: Restart Processes

· Step 6: Configure APEX_PUBLIC_USER Account

· Step 7: Download and Install Oracle Application Express Listener

· Step 8: Enable Network Services in Oracle Database 11g

· Step 9: Security Considerations

· Step 10: About Developing Oracle Application Express in Other Languages

· Step 11: About Managing JOB_QUEUE_PROCESSES

· Step 12: Create a Workspace and Add Oracle Application Express Users


For 12c Database

Go to the Oracle Application Express Installation Guide (Release 4.2 for Oracle Database 12c) and follow the instructions below:

4.4 Installing from the Database and Configuring the Oracle Application Express Listener

· Install the Oracle Database and Complete Preinstallation Tasks

· Download and Install Oracle Application Express Listener

· Configure RESTful Services

· Enable Network Services in Oracle Database 12c

· Security Considerations

· About Running Oracle Application Express in Other Languages

· About Managing JOB_QUEUE_PROCESSES

· Create a Workspace and Add Oracle Application Express Users


Note: APEX is pre-installed with Oracle Database 12c, but you need to configure it in order to use it.

For this demo, create a Workspace called DATAMINER that is based on an existing user account that has already been granted access to the Data Miner (this blog assumes DMUSER is the Data Miner user account). Please refer to the Oracle By Example Tutorials to review how to create a Data Miner user account and install the Data Miner Repository. In addition, you need to create an APEX user account (for simplicity I use DMUSER).

Build Models to Predict BUY_INSURANCE

This demo uses the demo data set, INSUR_CUST_LTV_SAMPLE, that comes with the Data Miner installation. Now, let’s use the Classification Build node to build some models using the CUSTOMER_ID as the case id and BUY_INSURANCE as the target.

Evaluate the Models

A nice thing about the Build node is that it builds a set of models using different algorithms within the same mining function by default, so we can select the best model to use. Let’s look at the models in the Test Viewer; here we can compare the models by looking at their Predictive Confidence, Overall Accuracy, and Average Accuracy values. Basically, the model with the highest values across these three metrics is the best one to use. As you can see, the winner here is the CLAS_DT_3_6 decision tree model.

Next, let’s see which input data columns are used as predictors for the decision tree model. You can find that information in the Model Viewer below. Surprisingly, it only uses a few columns for the prediction. These columns will be our input data requirement for the scoring function; the rest of the input columns can be ignored.
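If you want to confirm the predictor list outside of the Model Viewer, the data dictionary exposes the attributes a model actually uses. Here is a minimal query sketch, assuming the decision tree model is named CLAS_DT_3_6 as shown above:

-- List the attributes used by the decision tree model
SELECT attribute_name, attribute_type
FROM all_mining_model_attributes
WHERE model_name = 'CLAS_DT_3_6'
ORDER BY attribute_name;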


Score the Model

Let’s complete the workflow with an Apply node, from which we will generate the scoring SQL statement to be used for the Web Service. Here we reuse the INSUR_CUST_LTV_SAMPLE data as input data to the Apply node, and select only the required columns found in the previous step. Also, in the Classification Build node we deselect the other models as output in the Property Inspector (Models tab), so that only the decision tree model will be used for the Apply node. The generated scoring SQL statement will then use only the decision tree model to score against the limited set of input columns.

Generate SQL Statement for Scoring

After the workflow is run successfully, we can generate the scoring SQL statement via the “Save SQL” context menu of the Apply node as shown below.

Here is the generated SQL statement:

/* SQL Deployed by Oracle SQL Developer 4.1.0.14.78 from Node "Apply", Workflow "workflow score", Project "project", Connection "conn_12c" on Mar 16, 2014 */
ALTER SESSION set "_optimizer_reuse_cost_annotations"=false;
ALTER SESSION set NLS_NUMERIC_CHARACTERS=".,";
--ALTER SESSION FOR OPTIMIZER
WITH
/* Start of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
"N$10013" as (select /*+ inline */ "INSUR_CUST_LTV_SAMPLE"."BANK_FUNDS",
"INSUR_CUST_LTV_SAMPLE"."CHECKING_AMOUNT",
"INSUR_CUST_LTV_SAMPLE"."CREDIT_BALANCE",
"INSUR_CUST_LTV_SAMPLE"."N_TRANS_ATM",
"INSUR_CUST_LTV_SAMPLE"."T_AMOUNT_AUTOM_PAYMENTS"
from "DMUSER"."INSUR_CUST_LTV_SAMPLE" )
/* End of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
,
/* Start of sql for node: Apply */
"N$10011" as (SELECT /*+ inline */
PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PRED",
PREDICTION_PROBABILITY("DMUSER"."CLAS_DT_3_6", PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) USING *) "CLAS_DT_3_6_PROB",
PREDICTION_COST("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PCST"
FROM "N$10013" )
/* End of sql for node: Apply */
select * from "N$10011";

We need to modify the first SELECT statement to change the data source from a database table to a record that can be constructed on the fly, which is crucial for real-time scoring. Bind variables (e.g. :funds) are used; these variables will be replaced with actual data (passed in by the Web Service request) when the SQL statement is executed.

/* SQL Deployed by Oracle SQL Developer 4.1.0.14.78 from Node "Apply", Workflow "workflow score", Project "project", Connection "conn_12c" on Mar 16, 2014 */
WITH
/* Start of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
"N$10013" as (select /*+ inline */
:funds "BANK_FUNDS",
:checking "CHECKING_AMOUNT",
:credit "CREDIT_BALANCE",
:atm "N_TRANS_ATM",
:payments "T_AMOUNT_AUTOM_PAYMENTS"
from DUAL
)
/* End of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
,
/* Start of sql for node: Apply */
"N$10011" as (SELECT /*+ inline */
PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PRED",
PREDICTION_PROBABILITY("DMUSER"."CLAS_DT_3_6", PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) USING *) "CLAS_DT_3_6_PROB",
PREDICTION_COST("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PCST"
FROM "N$10013" )
/* End of sql for node: Apply */
select * from "N$10011";
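Before creating the Web Service, it is worth sanity-testing the modified statement directly in SQL Developer, with the bind variables temporarily replaced by literal values. A minimal sketch (the numbers below are made-up test values, not taken from the demo data):

WITH input_rec AS (
  -- one on-the-fly record standing in for the Web Service request
  SELECT 500 "BANK_FUNDS", 1000 "CHECKING_AMOUNT", 200 "CREDIT_BALANCE",
         3 "N_TRANS_ATM", 200 "T_AMOUNT_AUTOM_PAYMENTS"
  FROM DUAL
)
SELECT PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "PRED",
       PREDICTION_PROBABILITY("DMUSER"."CLAS_DT_3_6",
         PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) USING *) "PROB",
       PREDICTION_COST("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "PCST"
FROM input_rec;

If this returns a sensible prediction, probability, and cost, the statement is ready to be used in the RESTful service definition with the literals changed back to bind variables.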

Create Scoring Web Service

Assuming Oracle APEX and Oracle REST Data Services have been properly installed and configured, we can proceed to create a RESTful Web Service for real-time scoring. The following steps describe how to create the Web Service in APEX:

1. APEX Login

You can bring up the APEX login screen by pointing your browser to http://<host>:<port>/ords. Enter your Workspace name and account info to log in. The Workspace should be based on the Data Miner DMUSER account for this demo to work.

2. Select SQL Workshop

Select the SQL Workshop icon to proceed.

3. Select RESTful Services

Select RESTful Services to create the Web Service.

Click the “Create” button to continue.

4. Define Restful Services

Enter the following information to define the scoring Web Service in the RESTful Services Module form:

Name: buyinsurance

URI Prefix: score/

Status: Published

URI Template: buyinsurance?funds={funds}&checking={checking}&credit={credit}&atm={atm}&payments={payments}

Method: GET

Source Type: Query

Format: CSV

Source:

/* SQL Deployed by Oracle SQL Developer 4.1.0.14.78 from Node "Apply", Workflow "workflow score", Project "project", Connection "conn_11204" on Mar 16, 2014 */
WITH
/* Start of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
"N$10013" as (select /*+ inline */
:funds "BANK_FUNDS",
:checking "CHECKING_AMOUNT",
:credit "CREDIT_BALANCE",
:atm "N_TRANS_ATM",
:payments "T_AMOUNT_AUTOM_PAYMENTS"
from DUAL
)
/* End of sql for node: INSUR_CUST_LTV_SAMPLE APPLY */
,
/* Start of sql for node: Apply */
"N$10011" as (SELECT /*+ inline */
PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PRED",
PREDICTION_PROBABILITY("DMUSER"."CLAS_DT_3_6", PREDICTION("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) USING *) "CLAS_DT_3_6_PROB",
PREDICTION_COST("DMUSER"."CLAS_DT_3_6" COST MODEL USING *) "CLAS_DT_3_6_PCST"
FROM "N$10013" )
/* End of sql for node: Apply */
select * from "N$10011";

Note: JSON output format is also supported.

Lastly, create the following parameters that are used to pass the data from the Web Service request (URI) to the bind variables used in the scoring SQL statement.
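Based on the URI Template and the scoring SQL above, the mapping is one parameter per bind variable, roughly as follows (the exact parameter form fields depend on your APEX version):

funds → :funds
checking → :checking
credit → :credit
atm → :atm
payments → :payments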

The final RESTful Services Module definition should look like the following. Make sure “Requires Secure Access” is set to “No” (HTTPS secure requests are not addressed in this demo).

Test the Scoring Web Service

Let’s create a simple web page using your favorite HTML editor (I used JDeveloper to create this web page). The page includes a form that collects customer data and then fires off the Web Service request upon submission to get a prediction and the associated probability.

Here is the HTML source of the above Form:

<!DOCTYPE html>

<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

<title>score</title>

</head>

<body>

<h2>

Determine if Customer will Buy Insurance

</h2>

<form action="http://localhost:8080/ords/dataminer/score/buyinsurance" method="get">

<table>

<tr>

<td>Bank Funds:</td>

<td><input type="text" name="funds"/></td>

</tr>

<tr>

<td>Checking Amount:</td>

<td><input type="text" name="checking"/></td>

</tr>

<tr>

<td>Credit Balance:</td>

<td><input type="text" name="credit"/></td>

</tr>

<tr>

<td>Number ATM Transactions:</td>

<td><input type="text" name="atm"/></td>

</tr>

<tr>

<td>Amount Auto Payments:</td>

<td><input type="text" name="payments"/></td>

</tr>

<tr>

<td colspan="2" align="right">

<input type="submit" value="Score"/>

</td>

</tr>

</table>

</form>

</body>
</html>

When the Score button is pressed, the form sends a GET HTTP request to the web server with the collected form data encoded as name-value parameters in the URL.

http://localhost:8080/ords/dataminer/score/buyinsurance?funds={funds}&checking={checking}&credit={credit}&atm={atm}&payments={payments}

Notice that {funds}, {checking}, {credit}, {atm}, and {payments} will be replaced with actual data from the form. This URI matches the URI Template specified in the RESTful Services Module form above.

Let’s test the scoring Web Service by entering some values in the form and hitting the Score button to see the prediction.

The prediction along with its probability and cost is returned as shown below. Unfortunately, this customer is less likely to buy insurance.

Let’s change some values and see if we have any luck.

Bingo! This customer is more likely to buy insurance.

Conclusion

This blog shows how to deploy Data Miner generated scoring SQL as a Web Service, which can be consumed by different systems on different platforms from anywhere. In theory, any SQL statement generated from a Data Miner node could potentially be exposed as a Web Service. For example, we could have a Web Service that returns model details, and this info could be consumed by a BI tool for application integration purposes.

Wednesday Feb 26, 2014

How to generate training and test dataset using SQL Query node in Data Miner

Overview

In Data Miner, the Classification and Regression Build nodes include a process that splits the input dataset into training and test datasets internally, which are then used by the model build and test processes within the nodes. This internal data split feature saves the user from performing an external data split and then tying the split datasets into separate build and test processes, as is required in other competing products. However, there are times when a user may want to perform an external data split. For example, a user may want to generate a single training and test dataset and reuse them in multiple workflows. The generation of training and test datasets can be done easily via the SQL Query node.

Stratified Split

The stratified split is used internally by the Classification Build node because this technique preserves the categorical target distribution in the resulting training and test datasets, which is important for the classification model build. The following shows the SQL statements that are essentially used by the Classification Build node to produce the training and test datasets internally:

SQL statement for Training dataset

SELECT

v1.*

FROM

(

-- randomly divide members of the population into subgroups based on target classes

SELECT a.*,

row_number() OVER (partition by {target column} ORDER BY ORA_HASH({case id column})) "_partition_caseid"

FROM {input data} a

) v1,

(

-- get the count of subgroups based on target classes

SELECT {target column},

COUNT(*) "_partition_target_cnt"

FROM {input data} GROUP BY {target column}

) v2

WHERE v1.{target column} = v2.{target column}

-- random sample subgroups based on target classes in respect to the sample size

AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) <= (v2."_partition_target_cnt" * {percent of training dataset} / 100)


SQL statement for Test dataset

SELECT

v1.*

FROM

(

-- randomly divide members of the population into subgroups based on target classes

SELECT a.*,

row_number() OVER (partition by {target column} ORDER BY ORA_HASH({case id column})) "_partition_caseid"

FROM {input data} a

) v1,

(

-- get the count of subgroups based on target classes

SELECT {target column},

COUNT(*) "_partition_target_cnt"

FROM {input data} GROUP BY {target column}

) v2

WHERE v1.{target column} = v2.{target column}

-- random sample subgroups based on target classes in respect to the sample size

AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) > (v2."_partition_target_cnt" * {percent of training dataset} / 100)

The following describes the placeholders used in the SQL statements:

{target column} - target column. It must be categorical type.

{case id column} - case id column. It must contain unique numbers that identify the rows.

{input data} - input data set.

{percent of training dataset} - percent of training dataset. For example, if you want to split 60% of input dataset into training dataset, use the value 60. The test dataset will contain 100%-60% = 40% of the input dataset. The training and test dataset are mutually exclusive.

Random Split

The random split is used internally by the Regression Build node because the target is usually of numerical type. The following shows the SQL statements that are essentially used by the Regression Build node to produce the training and test datasets:

SQL statement for Training dataset

SELECT

v1.*

FROM

{input data} v1

WHERE ORA_HASH({case id column}, 99, 0) <= {percent of training dataset}

SQL statement for Test dataset

SELECT

    v1.*

FROM

{input data} v1

WHERE ORA_HASH({case id column}, 99, 0) > {percent of training dataset}

The following describes the placeholders used in the SQL statements:

{case id column} - case id column. It must contain unique numbers that identify the rows.

{input data} - input data set.

{percent of training dataset} - percent of training dataset. For example, if you want to split 60% of input dataset into training dataset, use the value 60. The test dataset will contain 100%-60% = 40% of the input dataset. The training and test dataset are mutually exclusive.
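For instance, using the same demo data as the stratified example in the next section, the random split statements might be filled in as follows (this is an illustration, assuming CUSTOMER_ID as the case id and a 60% training split):

-- Training dataset (roughly 60% of the rows)
SELECT v1.*
FROM "INSUR_CUST_LTV_SAMPLE" v1
WHERE ORA_HASH("CUSTOMER_ID", 99, 0) <= 60

-- Test dataset (the remaining rows)
SELECT v1.*
FROM "INSUR_CUST_LTV_SAMPLE" v1
WHERE ORA_HASH("CUSTOMER_ID", 99, 0) > 60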

Use SQL Query node to create training and test dataset

Assume you want to create the training and test datasets out of the demo INSUR_CUST_LTV_SAMPLE dataset using the stratified split technique. You can create the following workflow, which uses SQL Query nodes to execute the above split SQL statements to generate the datasets, and then uses Create Table nodes to persist the resulting datasets.

Assume the case id is CUSTOMER_ID, target is BUY_INSURANCE, and the training dataset is 60% of the input dataset. You can enter the following SQL statement to create the training dataset in the “SQL Query Stratified Training” SQL Query node:

SELECT

v1.*

FROM

(

-- randomly divide members of the population into subgroups based on target classes

SELECT a.*,

row_number() OVER (partition by "BUY_INSURANCE" ORDER BY ORA_HASH("CUSTOMER_ID")) "_partition_caseid"

FROM "INSUR_CUST_LTV_SAMPLE_N$10009" a

) v1,

(

-- get the count of subgroups based on target classes

SELECT "BUY_INSURANCE",

COUNT(*) "_partition_target_cnt"

FROM "INSUR_CUST_LTV_SAMPLE_N$10009" GROUP BY "BUY_INSURANCE"

) v2

WHERE v1."BUY_INSURANCE" = v2."BUY_INSURANCE"

-- random sample subgroups based on target classes in respect to the sample size

AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) <= (v2."_partition_target_cnt" * 60 / 100)



Likewise, you can enter the following SQL statement to create the test dataset in the “SQL Query Stratified Test” SQL Query node:

SELECT

v1.*

FROM

(

-- randomly divide members of the population into subgroups based on target classes

SELECT a.*,

row_number() OVER (partition by "BUY_INSURANCE" ORDER BY ORA_HASH("CUSTOMER_ID")) "_partition_caseid"

FROM "INSUR_CUST_LTV_SAMPLE_N$10009" a

) v1,

(

-- get the count of subgroups based on target classes

SELECT "BUY_INSURANCE",

COUNT(*) "_partition_target_cnt"

FROM "INSUR_CUST_LTV_SAMPLE_N$10009" GROUP BY "BUY_INSURANCE"

) v2

WHERE v1."BUY_INSURANCE" = v2."BUY_INSURANCE"

-- random sample subgroups based on target classes in respect to the sample size

AND ORA_HASH(v1."_partition_caseid", v2."_partition_target_cnt"-1, 0) > (v2."_partition_target_cnt" * 60 / 100)

Now run the workflow to create the training and test datasets. You can find the table names of the persisted datasets in the associated Create Table nodes.
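Optionally, you can verify that the stratified split preserved the target distribution with a quick query against the persisted tables. A minimal sketch, assuming the Create Table nodes produced tables named TRAINING_SET and TEST_SET (your generated table names will differ):

-- Compare the BUY_INSURANCE class distribution in the two datasets
SELECT 'TRAINING' dataset_name, "BUY_INSURANCE",
       COUNT(*) cnt,
       ROUND(RATIO_TO_REPORT(COUNT(*)) OVER () * 100, 1) pct
FROM TRAINING_SET
GROUP BY "BUY_INSURANCE"
UNION ALL
SELECT 'TEST', "BUY_INSURANCE",
       COUNT(*),
       ROUND(RATIO_TO_REPORT(COUNT(*)) OVER () * 100, 1)
FROM TEST_SET
GROUP BY "BUY_INSURANCE";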


Conclusion

This blog shows how easy it is to create the training and test datasets using the stratified split SQL statements via the SQL Query nodes. Similarly, you can generate the training and test datasets using the random split technique by replacing the SQL statements in the SQL Query nodes in the above workflow with the random split SQL statements. If a large dataset (tens of millions of rows) is used in multiple model build nodes, it may be a good idea to split the data ahead of time to optimize the overall processing time (avoiding multiple internal data splits inside the model build nodes).

Friday Feb 14, 2014

dunnhumby Accelerates Complex Segmentation Queries from Weeks to Minutes—Gains Competitive Advantage

See original story on http://www.oracle.com/us/corporate/customers/customersearch/dunnhumby-1-exadata-ss-2137635.html

dunnhumby Accelerates Complex Segmentation Queries from Weeks to Minutes—Gains Competitive Advantage

dunnhumby is the world’s leading customer-science company. It analyzes customer data and applies insights from more than 400 million customers across the globe to create better customer experiences and build loyalty. With its unique analytical capabilities, dunnhumby helps retailers better serve customers, create a competitive advantage, and enjoy sustained growth.


Challenges

A word from dunnhumby Ltd.

  • “Oracle Exadata Database Machine is helping us to transform our business and improve our competitive edge. We can now complete queries that took weeks in just minutes—driving new product offerings, more competitive bids, and more accurate analyses based on 100% of data instead of just a sampling.” – Chris Wones, Director of Data Solutions, dunnhumby USA

  • Expand breadth of services to maintain a competitive advantage in the customer-science industry
  • Provide clients, including major retail organizations in the United Kingdom and North America, with expanded historical and real-time insight into customer behavior, buying tendencies, and response to promotional campaigns and product offerings
  • Ensure competitive pricing for the company’s customer-analysis services while delivering added value to a growing client base
  • Analyze growing volumes of data rapidly and comprehensively
  • Ensure the security of sensitive information, including protected personal information to reduce risk and support compliance
  • Protect against data loss and reduce the backup and recovery window, as data is crucial to the competitive advantage and success of the business
  • Optimize IT investment and performance across the technology-intensive business
  • Reduce licensing and maintenance costs of previous analytical and data warehouse software

Solutions

  • Deployed Oracle Exadata Database Machine and accelerated queries that previously took two-to-three weeks to just minutes, enabling the company to bid on more complex, custom analyses and gain a competitive advantage
  • Achieved 4x to 30x more data compression using Hybrid Columnar Compression and Oracle Advanced Compression across sets—reducing storage requirements, increasing analysis and backup performance, and optimizing IT investment
  • Consolidated data marts securely with data warehouse schemas on Oracle Exadata, enabling much faster presummarization of large volumes of data
  • Accelerated analytic capabilities to near real time using Oracle Advanced Analytics and third-party tools, enabling analysis of unstructured big data from emerging sources, like smart phones
  • Accelerated segmentation and customer-loyalty analysis from one week to just four hours—enabling the company to deliver more timely information as well as finer-grained analysis
  • Improved analysts’ productivity and focus as they can now run queries and complete analysis without having to wait hours or days for a query to process
  • Generated more accurate business insights and marketing recommendations with the ability to analyze 100% of data—including years of historical data—instead of just a small sample
  • Improved accuracy of marketing recommendations by analyzing larger sample sizes and predicting the market’s reception to new product ideas and strategies
  • Improved secure processing and management of 60 TB of data, growing at a rate of 500 million customer records a week, including information from clients’ customer loyalty programs 
  • Ensured data security and compliance with requirements for safeguarding protected personal information and reduced risk with Oracle Advanced Security, Oracle Directory Services Plus, and Oracle Enterprise Single Sign-On Suite Plus
  • Gained high-performance identity virtualization, storage, and synchronization services that meet the needs of the company’s high-volume environment
  • Ensured performance scalability even with concurrent queries with Oracle Exadata, which demonstrated higher throughput than competing solutions under such conditions
  • Deployed integrated backup and recovery using Oracle’s Sun ZFS Backup Appliance—to support high performance and continuous availability and act as a staging area for both inbound and outbound extract, transform, and load processes

Why Oracle

dunnhumby considered Teradata, IBM Netezza, and other solutions, and selected Oracle Exadata for its ability to sustain high performance and throughput even during concurrent queries. “We needed to see how the system performed when scaled to multiple concurrent queries, and Oracle Exadata’s throughput was much higher than competitive offerings,” said Chris Wones, director of data solutions, dunnhumby, USA.

Implementation Process

dunnhumby began its Oracle Exadata implementation in September 2012 and went live in April 2013. It has installed four Oracle Exadata machine units in the United States and four in the United Kingdom. The company is using three of the four machines in each country as production environments and one machine in each country for development and testing. dunnhumby runs an active-active environment across its Oracle Exadata clusters to ensure high availability.

Monday Feb 03, 2014

How to generate Scatterplot Matrices using R script in Data Miner

Data Miner provides the Explorer node, which produces descriptive statistics and histograms that allow an analyst to analyze input data columns individually. Often an analyst is interested in analyzing the relationships among the data columns, so that he can choose the columns that are closely correlated with the target column for model building. To examine relationships among data columns, he can create scatter plots using the Graph node.

For example, an analyst may want to build a regression model that predicts the customer LTV (long term value) using the INSUR_CUST_LTV_SAMPLE demo data. Before building the model, he can create the following workflow with the Graph node to examine the relationships between the data columns of interest and the LTV target column.

In the Graph node editor, create a scatter plot with a data column of interest (X Axis) against the LTV target column (Y Axis). For the demo, let’s create three scatter plots using these data columns: HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT.

Here are the scatter plots generated by the Graph node. As you can see, HOUSE_OWNERSHIP and N_MORTGAGES are quite positively correlated with the LTV target column. However, MORTGAGE_AMOUNT seems less correlated with the LTV target column.

The problem with the above approach is that it is laborious to create scatter plots one by one, and you cannot examine the relationships among those data columns themselves. To solve the problem, we can create a scatterplot matrix graph as follows:

This is a 4 x 4 scatterplot matrix of the data columns LTV, HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT. In the top row, you can examine the relationships of HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT against the LTV target column. In the second row, you can examine the relationships of LTV, N_MORTGAGES, and MORTGAGE_AMOUNT against the HOUSE_OWNERSHIP column. In the third and fourth rows, you can examine the relationships of the other columns against N_MORTGAGES and MORTGAGE_AMOUNT respectively.

To generate this scatterplot matrix, we need to invoke the readily available R script RQG$pairs (via the SQL Query node) in Oracle R Enterprise. Please refer to http://www.oracle.com/technetwork/database/options/advanced-analytics/r-enterprise/index.html?ssSourceSiteId=ocomen for Oracle R Enterprise installation.

Let’s create the following workflow with the SQL Query node to invoke the R script. Note: a Sample node may be needed to sample down the data size (e.g. to 1000 rows) for a large dataset before it is used for charting.
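If you would rather sample in SQL than add a Sample node, Oracle's SAMPLE clause is a quick alternative. A minimal sketch (the 10 percent figure and the use of the base table name are illustrative):

-- Keep roughly 10% of the rows for charting
SELECT *
FROM "INSUR_CUST_LTV_SAMPLE" SAMPLE (10);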

Enter the following SQL statement in the SQL Query editor. rqTableEval is an Oracle R Enterprise SQL function that allows the user to invoke an R script from the SQL side. The first SELECT statement within the function specifies the input data (LTV, HOUSE_OWNERSHIP, N_MORTGAGES, and MORTGAGE_AMOUNT). The second SELECT statement specifies the optional parameters for the R script, where we define the graph title “Scatterplot Matrices”. The output of the function is an XML document with the graph data embedded in it.

SELECT VALUE FROM TABLE
(
rqTableEval(
cursor(select "INSUR_CUST_LTV_SAMPLE_N$10001"."LTV",
"INSUR_CUST_LTV_SAMPLE_N$10001"."HOUSE_OWNERSHIP",
"INSUR_CUST_LTV_SAMPLE_N$10001"."N_MORTGAGES",
"INSUR_CUST_LTV_SAMPLE_N$10001"."MORTGAGE_AMOUNT"
from "INSUR_CUST_LTV_SAMPLE_N$10001"), -- Input Cursor
cursor(select 'Scatterplot Matrices' as MAIN from DUAL), -- Param Cursor
'XML', -- Output Definition
'RQG$pairs' -- R Script
)
)

You can see which default R scripts are available in the R Scripts tab. This tab is visible only when an Oracle R Enterprise installation is detected.

Click the button in the toolbar to invoke the R script and produce the scatterplot matrix below.

You can copy the scatterplot matrix image to the clipboard or save it to an image file (PNG) for reporting purposes. To do so, right-click on the graph to bring up the pop-up menu below.

The Scatterplot matrix is also available in the Data Viewer of the SQL Query node. To open the Data Viewer, select the “View Data” item in the pop-up menu of the node.

The returned XML data is shown in the Data Viewer below. To view the scatterplot matrix embedded in the data, click on the XML data to bring up the icon at the far right of the cell, and then click on the icon to bring up the viewer.

Tuesday Jan 14, 2014

How to export data from the Explore Node using Data Miner and SQL Developer

Blog posting by Denny Wong, Principal Member of Technical Staff, User Interfaces and Components, Oracle Data Mining Development

The Explorer node generates descriptive statistical data and histogram data for all input table columns.  These statistical and histogram data may help the user analyze the input data to determine if any action (e.g. transformation) is needed before using it for data mining.  An analyst may want to export this data to a file for offline analysis (e.g. in Excel) or for reporting purposes.  The Explorer node generates this data to a database table specified in the Output tab of the Property Inspector.  In this case, the data is generated to a table named “OUTPUT_1_2”.


To export the table to a file, we can use the SQL Developer Export wizard. Go to the Connections tab in the Navigator Window, find the table “OUTPUT_1_2” under the proper connection, and bring up its pop-up menu. Click on the Export menu to launch the Export Wizard.


In the wizard, uncheck “Export DDL” and select the “Export Data” option since we are only interested in the data itself. In the Format option, select “excel” in this example (a dozen output formats are supported) and specify the output file name. When the wizard finishes, an Excel file is generated.


Let’s open the file to examine what is in it. As expected, it contains all statistical data for all input columns. The histogram data is listed in the last column (HISTOGRAMS), and it has the ODMRSYS.ODMR_HISTOGRAMS structure.


For example, let’s take a closer look at the histogram data for the BUY_INSURANCE column:

ODMRSYS.ODMR_HISTOGRAMS(ODMRSYS.ODMR_HISTOGRAM_POINT('"BUY_INSURANCE"',''No'',NULL,NULL,73.1),ODMRSYS.ODMR_HISTOGRAM_POINT('"BUY_INSURANCE"',''Yes'',NULL,NULL,26.9))

This column contains an ODMRSYS.ODMR_HISTOGRAMS object, which is an array of the ODMRSYS.ODMR_HISTOGRAM_POINT structure. We can describe the structure to see what is in it.
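For example, from SQL Developer or SQL*Plus you can describe the object types directly; the attributes they report are discussed below:

DESCRIBE ODMRSYS.ODMR_HISTOGRAMS
DESCRIBE ODMRSYS.ODMR_HISTOGRAM_POINT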


The ODMRSYS.ODMR_HISTOGRAM_POINT type contains five attributes, which represent the histogram data. ATTRIBUTE_NAME contains the attribute name (e.g. BUY_INSURANCE), ATTRIBUTE_VALUE contains the attribute values (e.g. No, Yes), GROUPING_ATTRIBUTE_NAME and GROUPING_ATTRIBUTE_VALUE are not used (these fields are used when the Group By option is specified), and ATTRIBUTE_PERCENT contains the percentages (e.g. 73.1, 26.9) for the attribute values respectively.


As you can see, the ODMRSYS.ODMR_HISTOGRAMS complex output format may be difficult to read, and it may require some processing before the data can be used. Alternatively, we can “unnest” the histogram data into transactional data format before exporting it. This way we don’t have to deal with the complex array structure, so the data is more consumable. To do that, we can write a simple SQL query to “unnest” the data and use the new SQL Query node (Extract histogram data) to run this query (see below). We then use a Create Table node (Explorer output table) to persist the “unnested” histogram data along with the statistical data.

1. Create a SQL Query node

Create a SQL Query node and connect the “Explore Data” node to it. You may rename the SQL Query node to “Extract histogram data” to make it clear it is used to “unnest” the histogram data.

2. Specify a SQL query to “unnest” histogram data

Double-click the “Extract histogram data” node to bring up the editor, and enter the following SELECT statement:

SELECT
    "Explore Data_N$10002"."ATTR",
    "Explore Data_N$10002"."AVG",
    "Explore Data_N$10002"."DATA_TYPE",
    "Explore Data_N$10002"."DISTINCT_CNT",
    "Explore Data_N$10002"."DISTINCT_PERCENT",
    "Explore Data_N$10002"."MAX",
    "Explore Data_N$10002"."MEDIAN_VAL",
    "Explore Data_N$10002"."MIN",
    "Explore Data_N$10002"."MODE_VALUE",
    "Explore Data_N$10002"."NULL_PERCENT",
    "Explore Data_N$10002"."STD",
    "Explore Data_N$10002"."VAR",
    h.ATTRIBUTE_VALUE,
    h.ATTRIBUTE_PERCENT
FROM
    "Explore Data_N$10002", TABLE("Explore Data_N$10002"."HISTOGRAMS") h

Click OK to close the editor. This query is used to extract out the ATTRIBUTE_VALUE and ATTRIBUTE_PERCENT fields from the ODMRSYS.ODMR_HISTOGRAMS nested object.

Note: you may select only the columns that contain the statistics you are interested in.  The "Explore Data_N$10002" is a generated unique name that references the Explorer node; you may have a slightly different name ending with some other unique number.

The query produces the following output.  The last two columns are the histogram data in transactional format.

3. Create a Create Table node to persist the “unnested” histogram data

Create a Create Table node and connect the “Extract histogram data” node to it. You may rename the Create Table node to “Explorer output table” to make it clear it is used to persist the “unnested” histogram data.


4. Export “unnested” histogram data to Excel file

Run the “Explorer output table” node to persist the “unnested” histogram data to a table. The name of the output table (OUTPUT_3_4) can be found in the Property Inspector below.


Next, we can use the SQL Developer Export wizard as described above to export the table to an Excel file. As you can see the histogram data are now in transactional format; they are more readable and can readily be consumed.


Tuesday Dec 31, 2013

Oracle BIWA Summit 2014 January 14-16, 2014 at Oracle HQ in Redwood Shores, CA


Oracle Business Intelligence, Warehousing & Analytics Summit - Redwood City

Oracle is a proud sponsor of the Business Intelligence, Warehousing & Analytics (BIWA) Summit happening January 14 – 16 at the Oracle Conference Center in Redwood City. The Oracle BIWA Summit brings together Oracle ACE experts, customers who are currently using or planning to use Oracle BI, Warehousing and Analytics products and technologies, partners and Oracle Product Managers, Support Personnel and Development Managers. Join us on Tuesday, January 14 at 5 p.m. to hear featured speaker Balaji Yelamanchili, Senior Vice President Analytics and Performance Management Products, for his keynote: Oracle Business Intelligence -- Innovate Faster. Visit the BIWA site http://www.biwasummit.com/ for more information today.

Among the approximately 50 technical presentations, featured talks, and Hands-on Labs, I'll be delivering a presentation on Oracle Advanced Analytics and a Hands-on Lab on using the OAA/Oracle Data Miner GUI.

 AA-1010 BEST PRACTICES FOR IN-DATABASE ANALYTICS

Session ID: AA-1010

Presenter: Charlie Berger, Oracle

Abstract:

In the era of Big Data, enterprises are acquiring increasing volumes and varieties of data from a rapidly growing range of internet, mobile, sensor and other real-time and near real-time sources.  The driving force behind this trend toward Big Data analysis is the ability to use this data for “actionable intelligence” -- to predict patterns and behaviors and to deliver essential information when and where it is needed. Oracle Database uniquely offers a powerful platform to perform this predictive analytics and location analysis with in-database data mining, statistical processing and SQL Analytics.  Oracle Advanced Analytics embeds powerful data mining algorithms and adds enterprise scale open source R to solve problems such as predicting customer behavior, anticipating churn, detecting fraud, market basket analysis and discovering customer segments.  Oracle Data Miner GUI , a new SQL Developer 4.0 Extension, enables business analysts to quickly analyze data and visualize data, build, evaluate and apply predictive models and deploy via SQL scripts sophisticated predictive analytics methodologies—all while keeping the data inside the Oracle Database.  Come learn best practices and customer examples for exploiting Oracle’s scalable, performant and secure in-database analytics capabilities to extract more value and actionable intelligence from your data.

HOL-AA-1008 LEARN TO USE ORACLE ADVANCED ANALYTICS FOR PREDICTIVE ANALYTICS SOLUTIONS

Session ID: HOL-AA-1008

Presenter: Charles Berger, Oracle & Karl Rexer, Rexer Analytics

Abstract:

Big Data;  Bigger Insights!  Oracle Data Mining Release 12c, a component of the Oracle Advanced Analytics database Option, embeds powerful data mining algorithms in the SQL kernel of the Oracle Database for problems such as predicting customer behavior, anticipating churn, identifying up-sell and cross-sell, detecting anomalies and potential fraud, market basket analysis, customer profiling, text mining and retail market basket analysis.  Oracle Data Miner GUI , a new SQL Developer 4.0 Extension, enables business analysts to quickly analyze data and visualize data, build, evaluate and apply predictive models and develop sophisticated predictive analytics methodologies—all while keeping the data inside Oracle Database.  Come see how easily you can discover big insights from your Oracle data and generate SQL scripts for deployment and automation and deploying results into Oracle Business Intelligence (OBIEE) dashboards. 


Monday Dec 09, 2013

Come See and Test Drive Oracle Advanced Analytics at the BIWA Summit'14, Jan 14-16, 2014

The BIWA Summit '14 January 14-16 at Oracle HQ Conference Center Detailed Agenda is now published.   

Please share with your others by Tweeting, Blogging, Facebook, LinkedIn, Email, etc.!

The BIWA Summit is known for novel and interesting use cases of Oracle Big Data, Exadata, Advanced Analytics/Data Mining, OBIEE, Spatial, Endeca and more!  There are opportunities to get hands-on experience with products in the Hands-on Labs, great customer case studies, and talks by Oracle Technical Professionals and Partners.  Meet with technical experts.  Click HERE to read detailed abstracts and speaker profiles.

Use the SPECIAL DISCOUNT code ORACLE12C and registration is only $199 for the 2.5 day technically focused Oracle user group event.

Charlie  (Oracle Employee Advisor to Oracle BIWA Special Interest User Group)

----



Tuesday Nov 12, 2013

Oracle Big Data Learning Library

Click on LEARN BY PRODUCT to view all learning resources.

Oracle Big Data Essentials

Attend this Oracle University Course!

Using Oracle NoSQL Database

Attend this Oracle University class!

Oracle and Big Data on OTN

See the latest resource on OTN.


Wednesday Sep 04, 2013

Oracle Data Miner (Extension of SQL Developer 4.0) Integrate Oracle R Enterprise Mining Algorithms into workflow using the SQL Query node

I posted a new white paper authored by Denny Wong, Principal Member of Technical Staff, User Interfaces and Components, Oracle Data Mining Technologies.  You can access the white paper here and the companion files here.  Here is an excerpt:

Oracle Data Miner (Extension of SQL Developer 4.0) 

Integrate Oracle R Enterprise Mining Algorithms into workflow using the SQL Query node

Oracle R Enterprise (ORE), a component of the Oracle Advanced Analytics Option, makes the open source R statistical programming language and environment ready for the enterprise and big data. Designed for problems involving large amounts of data, Oracle R Enterprise integrates R with the Oracle Database. R users can develop, refine and deploy R scripts that leverage the parallelism and scalability of the database to perform predictive analytics and data analysis.

Oracle Data Miner (ODMr) offers a comprehensive set of in-database algorithms for performing a variety of mining tasks, such as classification, regression, anomaly detection, feature extraction, clustering, and market basket analysis. One of the important capabilities of the new SQL Query node in Data Miner 4.0 is a simplified interface for integrating R scripts registered with the database. This provides the support necessary for R Developers to provide useful mining scripts for use by data analysts. This synergy provides many additional benefits as noted below.

· R developers can further extend ODMr mining capabilities by incorporating the extensive R mining algorithms from open source CRAN packages or by leveraging any user-developed custom R algorithms via the SQL interfaces provided by ORE.

· Since the SQL Query node can be part of a workflow process, R scripts can leverage functionality provided by other workflow nodes, which can simplify the overall effort of integrating R capabilities within the database.

· R mining capabilities can be included in the workflow deployment scripts produced by the new SQL script generation feature, so the ability to deploy R functionality within the context of a Data Miner workflow is easily accomplished.

· Data and processing are secured and controlled by the Oracle Database. This alleviates a lot of the risk incurred with other providers, where users have to export data out of the database in order to perform advanced analytics.

Oracle Advanced Analytics saves analysts, developers, database administrators and management the headache of trying to integrate R and database analytics. Instead, users can quickly gain the benefit of new R analytics and spend their time and effort on developing business solutions instead of building homegrown analytical platforms.

This paper should be very useful to R developers wishing to better understand how to leverage embedding R scripts for use by data analysts.  Analysts will also find the paper useful to see how R features can be surfaced for their use in Data Miner. The specific use case covered demonstrates how to use the SQL Query node to integrate R glm and rpart regression model build, test, and score operations into the workflow, along with nodes that perform data preparation and residual plot graphing. However, the integration process described here can easily be adapted to integrate other R operations, like statistical data analysis and advanced graphing, to expand ODMr functionality.
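To give a flavor of the registration step the paper relies on, here is a minimal sketch of registering a custom R script with the database through ORE so that a SQL Query node can later invoke it (the script name and body are illustrative, not taken from the paper):

BEGIN
  -- register an R function under a name that SQL (e.g. rqTableEval) can reference
  sys.rqScriptCreate('MY_SUMMARY',
    'function(dat) {
       # return a simple per-column mean as a data.frame
       data.frame(column = names(dat), mean_value = sapply(dat, mean))
     }');
END;
/

Once registered, the script can be invoked from the SQL Query node via the ORE SQL table functions (such as rqTableEval), in the same way the RQG$pairs script is called in the scatterplot matrix example elsewhere on this blog.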


Monday Jul 15, 2013

Oracle Data Miner GUI, part of SQL Developer 4.0 Early Adopter 1 is now available for download on OTN

The NEW Oracle Data Miner GUI, part of SQL Developer 4.0 Early Adopter 1 is now available for download on OTN.  See link to SQL Developer 4.0 EA1.   


The Oracle Data Miner 4.0 New Features are applicable to Oracle Database 11g Release 2 and Oracle Database Release 12c:  See Oracle Data Miner Extension to SQL Developer 4.0 Release Notes for EA1 for additional information  

· Workflow SQL Script Deployment

o Generates SQL scripts to support full deployment of workflow contents

· SQL Query Node

o Integrate SQL queries to transform data or provide a new data source

o Supports the running of R Language Scripts and viewing of R generated data and graphics


· Graph Node

o Generate Line, Scatter, Bar, Histogram and Box Plots



· Model Build Node Improvements

o Node level data usage specification applied to underlying models

o Node level text specifications to govern text transformations

o Displays heuristic rules responsible for excluding predictor columns

o Ability to control the amount of Classification and Regression test results generated

· View Data

o Ability to drill in to view custom objects and nested tables

These new Oracle Data Miner GUI capabilities expose Oracle Database 12c and Oracle Advanced Analytics/Data Mining Release 1 features:

· Predictive Query Nodes

o Predictive results without the need to build models using Analytical Queries

o Refined predictions based on data partitions

· Clustering Node New Algorithm

o Added Expectation Maximization algorithm

· Feature Extraction Node New Algorithms

o Added Singular Value Decomposition and Principal Component Analysis algorithms

· Text Mining Enhancements

o Text transformations integrated as part of Model's Automatic Data Preparation

o Ability to import Build Text node specifications into a Model Build node

· Prediction Result Explanations

o Scoring details that explain predictive result

· Generalized Linear Model New Algorithm Settings

o New algorithm settings provide feature selection and generation

See OAA on OTN pages http://www.oracle.com/technetwork/database/options/advanced-analytics/index.html for more information on Oracle Advanced Analytics.


Wednesday May 08, 2013

Oracle Advanced Analytics and Data Mining at the Movies on YouTube

Periodically, I've recorded demonstrations and/or presentations on Oracle Advanced Analytics and Data Mining and posted them on YouTube.  Here are links to some of the more recent YouTube postings, sort of an Oracle Advanced Analytics and Data Mining at the Movies experience.

So.... grab your popcorn and a comfortable chair.  Hope you enjoy!

Charlie 

Oracle Advanced Analytics at the Movies


Thursday Mar 21, 2013

Recorded Webcast: Best Practices using Oracle Advanced Analytics with Oracle Exadata

Best Practices using Oracle Advanced Analytics with Oracle Exadata

 On Demand
Launch Presentation


Join us to learn how Oracle Advanced Analytics extends the Oracle database into a comprehensive advanced analytics platform through two major components, Oracle R Enterprise and Oracle Data Mining. Using these tools with Oracle Exadata Database Machine will allow organizations to perform at their peak and find real business value within their data.

You need to visit this Oracle Exadata Webcast main page first and submit your registration information.  Then you’ll receive an email so you can view the Webcast.  This is external, so you can share it with anyone; they can download the presentation as well.  FYI.  Charlie


Wednesday Mar 13, 2013

Oracle OpenWorld Call for Proposals now OPEN Submit your Oracle Advanced Analytics/Data Mining/ORE talks today!!

Calling All Oracle OpenWorld Oracle Advanced Analytics, Data Mining and R Experts


The Call for Proposals is open. Have something interesting to present to the world’s largest gathering of Oracle technologists and business leaders? Making breakthrough innovations with Java or MySQL? We want to hear from you, and so do the attendees at this year’s Oracle OpenWorld, JavaOne, and MySQL Connect conferences. Submit your proposal now for a chance to share your expertise at one of the most important technology and business conferences of the year.

CHOOSE...

Select one of Oracle’s premiere conferences

SHARE...

Submit your proposal for sharing your most innovative ideas and experiences

JOIN...

Connect with the elite community of Oracle OpenWorld, JavaOne, and MySQL Connect session leaders in 2013

We recommend you take the time to review the General Information, Content Program Policies, and Tips and Guidelines pages before you begin. We look forward to your submissions!


Submit Papers

Please submit your papers by clicking on the link below and then select the event for which you are submitting.

Submit Now!

General Information

Conferences location: San Francisco, California, USA


Dates

  • Oracle OpenWorld: Sunday, September 22, 2013–Thursday, September 26, 2013
  • JavaOne: Sunday, September 22, 2013–Thursday, September 26, 2013
  • MySQL Connect: Saturday, September 21–Monday, September 23, 2013

Key 2013 deadlines

Deliverables: Due Dates

Call for Proposals–Open: Wednesday, March 13

Call for Proposals–Closed: Friday, April 12, 11:59 p.m. PDT

Notifications for accepted and declined submissions sent: Mid-June

For Oracle OpenWorld, Oracle employee submitters will need to contact the appropriate Oracle track leads before submitting. To view a list of track leads, click here

Contact us:

Friday Feb 22, 2013

Take a FREE Test Drive with Oracle Advanced Analytics/Data Mining on the Amazon Cloud

I wanted to highlight a wonderful new resource provided by our partner Vlamis Software.  Extremely easy!  Fill out the form, wait a few minutes for the Amazon Cloud instance to start up, and then BAM!  You can log in and start using the Oracle Advanced Analytics Oracle Data Miner workflow GUI.  Demo data and online Oracle by Example learning tutorials are also provided to ensure your data mining test drive is a positive one.  Enjoy!!

Test Drive -- Powered by Amazon AWS

We have partnered with Amazon Web Services to provide to you, free of charge, the opportunity to work hands-on with the latest of Oracle's Business Intelligence offerings. By signing up for one of the labs below, Amazon's Elastic Compute Cloud (EC2) environment will generate a complete server for you to work with.

These hands-on labs work with the actual Oracle software running on the Amazon Web Services EC2 environment. They each take approximately 2 hours to work through and will give you hands-on experience with the software and a tour of the features. Your EC2 environment will be available for 5 hours, at which time it will self-terminate. If, after registration, you need additional time or further instructions, simply reply to the registration email and we would be glad to help you.

Data Mining

This test drive walks through some basic exercises in doing predictive analytics within an Oracle 11g Database instance using the Oracle Data Miner extension for Oracle SQL Developer. You use a drag-and-drop "workflow" interface to build a data mining model that predicts the likelihood of purchase for a set of prospects. Oracle Data Mining is ideal for automatically finding patterns, understanding relationships, and making predictions in large data sets.


Monday Jan 28, 2013

BIWA Summit 2013 Presentations Now Available for Viewing

If you missed the BIWA Summit 2013, you can still look through the presentations from the event. 

Go to the Schedule at http://www.biwasummit.com/schedule and download the presentations using the links for each session.  You can forward this to customers, prospects, and others within Oracle.  All content is external.

The Oracle BIWA Summit, organized by the leading Oracle Special Interest Group (SIG) for Business Intelligence, Data Warehousing and Analytics professionals, was held on Jan 9-10, 2013, at the Oracle HQ Sofitel Hotel in Redwood City, CA. The Oracle BIWA Summit brings together Oracle ACE experts, customers who are currently using or planning to use Oracle BI, Warehousing and Analytics products and technologies, partners, and Oracle Product Managers, Support Personnel, and Development Managers. Everything and everyone that you will need to be successful in your Oracle “BIWA” implementations was at the Oracle BIWA Summit, Jan 9-10, 2013.

The next BIWA Summit will be at the HQ Conference Center, Jan 14-16, 2014.  Mark your calendars.

About

Everything about Oracle Data Mining, a component of the Oracle Advanced Analytics Option - News, Technical Information, Opinions, Tips & Tricks. All in One Place
