Predictive Analytics is one of the widely used flavours of Analytics. Nowadays, most of the customers want to leverage machine learning(ML) techniques to identify the likelihood of future outcomes based on historical data. To predict the future KPIs appropriate Machine learning Models require to be developed and used for predictive analytics.

We can implement ML either in Oracle Analytics Cloud/Oracle Analytics Server or in Oracle Database. Data Flow is gui based tool in OAC/OAS which can be used to develop the ML model, however in database it can be implemented using Oracle Data Miner or any programming language such as Python, R etc. We will discuss all three options in this blog

 

OracleMachineLearning

 

Sample Use Case

A retail bank has a requirement to perform churn prediction analytics on their current customer data. They have historical customer churn data with various attributes for e.g. Customer ID, Name,Credit Score,Geography,Gender, Age,Tenure,Balance,Number of Products, has Credit Card, Is Active, Estimated Salary, Exited(Churn).

We will train a ML model on historical customer churn data and then use this model on current customer data to predict churn probability.

(Sample bank customer data sets are provided on Github repository, which can be imported in Oracle Database).

Machine Learning with Oracle Analytics Cloud/Oracle Analytics Server

Features

  • Gui based tool to implement
  • Various buit-in algorithms
  • No coding skills required
  • Predicted Data ready for quick visualisation
  • Scheduling is possible to pre-calculate during off business hours
  • Preferable for small datasets

Train the Machine Learning Model

  1. Create a dataset from bank customer churn historical data
  2. Create a Data Flow and take the bank customer churn historical data as Data Source.
  3. Select the attributes which are significant for customer churn prediction. For e.g. Rownum, Customer ID and Surname are not required.
  4. Select the appropriate ML algorithm for e.g Train Binary Classifier in this case, and select the target attribute which is Exited(Churn) in this case.
  5. Save the Model with Appropriate name.
  6. Run the DataFlow, it will create a machine learning Model in OAC/OAS.

DataFlow

 

7. Navigate to machine learning tab from home Page burger menu and the check the accuracy of the model.

 

CheckAccuracy

 

Use Trained Machine Learning Model for Prediction

The machine learning model has been trained in first step, now it can be applied on current customer data to predict the probability of customer churn.

  1. Create another Data Flow and take current customer data (please note that this dataset does not have “Exit”(churn) attribute and this will be predicted using previously trained model)
  2. Apply the ml model which has been trained in last step(“Train the Model”) and save this dataset to perform Analytics.

 

DataFlow2

 

3. This Dataset will have churn prediction( 0 or 1) and churn probability. Now this can be used for Analytics

 

Visualisation

 

Machine Learning with Oracle Data Miner in Oracle Database

Oracle Data Miner enables data scientists and business and data analysts to work directly with data inside the database using a graphical “drag and drop” workflow editor. Oracle Data Miner, an extension to Oracle SQL Developer, captures and documents in graphical analytical workflows the steps users take while exploring data and developing machine learning methodologies.

Features

  • No extracting data to separate ML engine
  • Fast and scalable
  • Explanatory prediction details
  • Handle large volume of data
  • GUI based tool
  • No coding skills required
  • Generates SQL and PL/SQL scripts from workflows to automate and accelerate model deployment throughout the enterprise
  • Work with big data sources, including Oracle Database, Spark, Hadoop and other data sources

 

OracleDataMiner

 

Train the Machine Learning Model and Use it for Prediction

  1. Open Data miner tab in SQL developer and create a new project.
  2. Create new workflow and save it with any name for e.g. “CustomerChurn”
  3. Drag and drop Data Source node from Tool Palette and select customer_churn_history table(having customer’s historical data)
  4. Drag and drop sample node and configure sample size for train the model
  5. Drag and drop classification Model to use set of classifications algorithms
  6. Run all the nodes and check accuracy results of all four algorithms by right click and selecting on “compare rest results” of classification model node.
  7. Optionally, you can take one more sample from customer_churn_history and retest the model for accuracy.
  8. Now take one more data source node from Tool Palette and select customer_churn_current table(customer’s current data)
  9. Put apply model node from tool palette and apply model on current data source
  10. write back churn prediction in database with create table or update table node.

ODMWorkFlow

ODMWorkFlow2

11. Connect OAC/OAS with database and perform analytics on this predicted Data.

Visualisation2

To know more about Oracle Data Miner, kindly follow this documentation.

https://www.oracle.com/big-data/technologies/dataminer/

Use this oracle documentation to setup and configure oracle data miner.

https://www.oracle.com/webfolder/technetwork/tutorials/obe/db/12c/r1/dm/dm_41/ODM12c-41_SetUp.html

 

Machine Learning with Programming Language in Oracle Database

Another way of leveraging machine learning in Oracle Database is to use any programming language for e.g. SQL, Python, R etc. Oracle machine learning provides interfaces for three popular data science languages SQL, R, and Python, which are OML4SQL,OML4Py and OML4R.

Here I am taking Python as an example.

Features

  • No extracting data to separate ML engine
  • Instant Deployment.
  • Optimized algorithms.
  • Models produced faster and with less effort.
  • Better data protection and security.
  • ML access for Data Scientists and non-experts.
  • Can leverage automated machine learning(Auto Ml).
  • Flexibility to use all open-source Libraries.
  • Coding Skills required.
  • Handle Large Volume of Data.

Train the Machine Learning Model and use it for Prediction

  1. Connect the Oracle Database using OML library.
  2. Fetch the bank historical data to train the ml model.
  3. Test the ml model for accuracy.
  4. Fetch customer_churn_current table(current customer data) to predict the customer churn.
  5. Write back/update the predicted data in database.
  6. Connect Oracle Analytics Cloud/Oracle Analytics Server with database and perform analytics.

OML4PY

Visualisation3

 

Sample Data files and scripts

Please refer Github repository for sample data files and scripts.

Disclaimer- Please note that these are samples scripts to show to how to use machine learning with Oracle Analytics Cloud/Oracle Analytics Server and Oracle database, kindly do not refer for learning machine learning concepts.

ChurnBank CustomersHistory.csv :- Bank’s Customer Churn Historical data

ChurnBankCustomersCurrent.csv:- Bank’s Customer current data

CustomerChurn.xml :- Oracle Data Miner workflow xml file(you can import in Sql Developer)

ChurnBankCustClsDBPredict.py:- Sample Python script to connect Oracle database, train the model and perform prediction

GITHUB Repository

https://github.com/amittyagi27/MachineLearning.git