Machine learning is an advanced form of analytics: it helps organisations analyse trends in customer behaviour and in business operations, which supports decision making. Implementing machine learning requires expertise in ML tools and technologies.
Oracle AutoML (Automated Machine Learning) is a key feature of Oracle Machine Learning that helps non-expert data scientists build and use an efficient model. AutoML consists of several pre-built classes that help identify the best machine learning algorithm, an appropriate feature set, and model hyperparameters. AutoML supports only classification and regression algorithms.
AutoML Algorithm Selection
The class for algorithm selection is oml.automl.AlgorithmSelection. This class ranks algorithms from the set of supported OML algorithms based on the provided dataset.
To use automl.AlgorithmSelection, pass the mining function as “classification” or “regression”, depending on the target output, together with a corresponding score metric, e.g. r2 or f1_macro. Please refer to the Documentation for the metrics supported for binary classification, multi-class classification and regression.
The “select” method returns a sorted list of the top algorithms and their predicted rank.
Kindly refer to this example for more details on the automl.AlgorithmSelection class.
#Import libraries
import oml
from oml import automl
from sklearn import linear_model
import pandas as pd
import os
#Connect to Database
oml.connect(user='oml_user1', password='oracle', host='localhost', port=1521, service_name='orclpdb19', automl=True)
print(oml.isconnected())
# Pull data from the database table into a Python object
coviddataset = oml.sync(table='COVIDDATASET')
coviddataset = coviddataset.drop('ISO_CODE')
coviddataset
# Split Data in train and test
train, test = coviddataset.split(ratio=(0.8, 0.2), hash_cols='REPORTEDDATE', seed=32)
x_train, y_train = train.drop('TOTAL_CASES'), train['TOTAL_CASES']
x_test, y_test = test.drop('TOTAL_CASES'), test['TOTAL_CASES']
#AUTOMATED ALGORITHM SELECTION
# Create an automated algorithm selection object with r2 as the score_metric argument.
algo = automl.AlgorithmSelection(mining_function='regression',
                                 score_metric='r2', parallel=4)
# Run algorithm selection to get the top k predicted algorithms and their ranking without tuning.
algo_ranking=algo.select(x_train,y_train,k=3)
# Show the ranked algorithms and their scores.
[(m, "{:.2f}".format(s)) for m, s in algo_ranking]
Output: [('nn', '0.93'), ('glm_ridge', '0.91'), ('glm', '0.91')]
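To make the r2 score metric and the shape of the ranking more concrete, here is a minimal plain-Python sketch. It is not how oml.automl.AlgorithmSelection works internally; the helper names (r2_score, rank_by_r2) and the candidate names and predictions are made up for illustration. It computes r2 for a few sets of predictions and returns the top k as (name, score) pairs, best first.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("r2 is undefined when the target is constant")
    return 1.0 - ss_res / ss_tot


def rank_by_r2(y_true, candidates, k=3):
    """Return the k best (name, score) pairs, highest r2 first."""
    scored = [(name, r2_score(y_true, preds)) for name, preds in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
candidates = {
    "model_a": [1.0, 2.0, 3.0, 4.0],  # perfect fit, r2 = 1
    "model_b": [2.5, 2.5, 2.5, 2.5],  # always predicts the mean, r2 = 0
    "model_c": [1.5, 2.5, 3.5, 4.5],  # constant offset of 0.5
}

ranking = rank_by_r2(y_true, candidates, k=2)
print([(name, "{:.2f}".format(score)) for name, score in ranking])
# prints [('model_a', '1.00'), ('model_c', '0.80')]
```

An r2 of 1 means the predictions match the target exactly, while 0 means the model does no better than always predicting the mean.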
AutoML Feature Selection
The class for feature selection is oml.automl.FeatureSelection. This class automatically selects the most relevant features for a dataset and model.
To use oml.automl.FeatureSelection, specify the mining function, “classification” or “regression”, and a score metric, e.g. r2 or f1_macro. Please refer to the Documentation for the full list of score metrics.
Call the reduce method with the dataset and an Oracle Machine Learning algorithm to get the relevant feature subset of the dataset.
Kindly refer to this example (continuing the code above) for more details on the oml.automl.FeatureSelection class.
# AUTOMATED FEATURE SELECTION
# Calculate model performance before feature selection.
model = oml.glm(mining_function='regression').fit(x_train, y_train)
score = model.score(x_test, y_test)
score
Output: 0.932823
# Try AutoML feature selection with the coefficient of determination (r2) as the score metric.
fs = automl.FeatureSelection(mining_function='regression', score_metric='r2', parallel=4)
# Find the reduced feature subset on the train data set.
fsubset = fs.reduce('glm', x_train, y_train)
"{} features reduced to {}".format(len(x_train.columns), len(fsubset))
Output: '27 features reduced to 6'
# Use the subset to select the features and create a model on the new reduced data set.
x_train_new = x_train[:,fsubset]
x_test_new = x_test[:,fsubset]
model = oml.glm(mining_function='regression').fit(x_train_new, y_train)
score = model.score(x_test_new, y_test)
"{:.2} with {:.1f}x feature reduction".format(score, len(x_train.columns)/len(x_train_new.columns))
Output: '0.92 with 4.5x feature reduction'
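As a rough intuition for what a feature selector does, the sketch below keeps the features most strongly correlated with the target. This is a simple correlation filter swapped in for illustration, not the method oml.automl.FeatureSelection uses; the function names, column names and data are all hypothetical.

```python
def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select_features(columns, target, keep):
    """Return the names of the `keep` columns with the largest |correlation|."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep]


# Hypothetical feature columns and target, for illustration only.
columns = {
    "feature_a": [1.0, 2.0, 3.0, 4.0, 5.0],  # perfectly correlated with the target
    "feature_b": [7.0, 7.0, 7.0, 7.0, 7.0],  # constant, carries no information
    "feature_c": [3.0, 1.0, 4.0, 1.0, 5.0],  # weakly correlated
}
target = [2.0, 4.0, 6.0, 8.0, 10.0]

subset = select_features(columns, target, keep=2)
print("{} features reduced to {}".format(len(columns), len(subset)))
# prints 3 features reduced to 2
```

A real selector also has to weigh features against each other and against the chosen algorithm, which a per-column correlation filter cannot do.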
AutoML Model Tuning
The class for model tuning is oml.automl.ModelTuning. This class tunes the hyperparameters of a given algorithm on the training dataset.
To use oml.automl.ModelTuning, specify the dataset and the algorithm; the output is the tuned model and its corresponding hyperparameters.
You can also provide a customized hyperparameter search space and scoring metric; for more details please refer to the Documentation.
Kindly refer to this example (continuing the code above) for more details on the oml.automl.ModelTuning class.
# MODEL TUNING
# Start an automated model tuning run with a generalized linear model.
at = automl.ModelTuning(mining_function='regression', parallel=4)
results = at.tune('glm', x_train, y_train)
# Show the tuned model details.
tuned_model = results['best_model']
tuned_model
Output:
Algorithm Name: Generalized Linear Model
Mining Function: REGRESSION
Target: TOTAL_CASES
Settings:
    setting name                  setting value
0   ALGO_NAME                     ALGO_GENERALIZED_LINEAR_MODEL
1   GLMS_CONF_LEVEL               .95
2   GLMS_FTR_GENERATION           GLMS_FTR_GENERATION_DISABLE
3   GLMS_FTR_SELECTION            GLMS_FTR_SELECTION_DISABLE
4   GLMS_NUM_ITERATIONS           30
5   GLMS_RIDGE_REGRESSION         GLMS_RIDGE_REG_DISABLE
6   GLMS_SOLVER                   GLMS_SOLVER_CHOL
7   ODMS_DETAILS                  ODMS_DISABLE
8   ODMS_MISSING_VALUE_TREATMENT  ODMS_MISSING_VALUE_AUTO
9   ODMS_SAMPLING                 ODMS_SAMPLING_DISABLE
10  PREP_AUTO                     ON
# Show the best tuned model train score and the corresponding hyperparameters.
print(results['all_evals'][0])
score, params = results['all_evals'][0]
"{:.2}".format(score), ["{}:{}".format(k, params[k]) for k in sorted(params)]
Output: (0.92, {'GLMS_NUM_ITERATIONS': 30, 'GLMS_SOLVER': 'GLMS_SOLVER_CHOL'})
# Use the tuned model to get the score on the test set.
"{:.2}".format(tuned_model.score(x_test, y_test))
Output: 0.94
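The idea of searching a hyperparameter space can be sketched in plain Python as a random search. This is only an illustration under stated assumptions: the layout of the search-space dictionary, the hyperparameter names and the toy objective are invented here and are not the oml.automl.ModelTuning API. The result mimics the example above by keeping every evaluation as a (score, params) pair, best first.

```python
import random


def sample(space, rng):
    """Draw one hyperparameter combination from the search space."""
    params = {}
    for name, spec in space.items():
        values = spec["range"]
        if spec["type"] == "continuous":
            params[name] = rng.uniform(values[0], values[1])
        elif spec["type"] == "discrete":
            params[name] = rng.randint(values[0], values[1])  # both ends inclusive
        else:  # categorical: pick one of the listed values
            params[name] = rng.choice(values)
    return params


def tune(objective, space, n_trials=20, seed=0):
    """Random search: evaluate n_trials samples and sort them best first."""
    rng = random.Random(seed)
    evals = []
    for _ in range(n_trials):
        params = sample(space, rng)
        evals.append((objective(params), params))
    evals.sort(key=lambda item: item[0], reverse=True)
    return {"best_params": evals[0][1], "all_evals": evals}


# Hypothetical search space and scoring function, for illustration only.
search_space = {
    "num_iterations": {"type": "discrete", "range": [10, 50]},
    "tolerance": {"type": "continuous", "range": [1e-6, 1e-3]},
    "solver": {"type": "categorical", "range": ["chol", "qr"]},
}


def toy_objective(params):
    """Stand-in for 'train a model and score it'; peaks at 30 iterations."""
    penalty = abs(params["num_iterations"] - 30) / 100.0
    bonus = 0.05 if params["solver"] == "chol" else 0.0
    return 0.9 - penalty + bonus


results = tune(toy_objective, search_space, n_trials=20, seed=0)
best_score, best_params = results["all_evals"][0]
print("{:.2f}".format(best_score), sorted(best_params))
```

In a real tuning run the objective would train and validate a model for each combination, which is why the number of trials and the size of the search space matter for run time.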
AutoML Model Selection
The class for model selection is oml.automl.ModelSelection. This class automatically selects an Oracle Machine Learning algorithm based on the given score metric and then tunes that algorithm.
To use oml.automl.ModelSelection, just pass the dataset and the number of algorithms to tune.
Use the select method to get the best model out of all the models analysed.
Kindly refer to this example for more details on the oml.automl.ModelSelection class.
# MODEL SELECTION
# Create an automated model selection object with r2 as the score metric.
modelsel = automl.ModelSelection(mining_function='regression', score_metric='r2', parallel=4)
# Run model selection to get the top (k=1) predicted algorithm (defaults to the tuned model).
selectmodel = modelsel.select(x_train, y_train, k=1)
# Show the selected and tuned model.
selectmodel
Output:
Algorithm Name: Neural Network
Mining Function: REGRESSION
Target: TOTAL_CASES
Settings:
    setting name                  setting value
0   ALGO_NAME                     ALGO_NEURAL_NETWORK
1   LBFGS_GRADIENT_TOLERANCE      .000000001
2   LBFGS_HISTORY_DEPTH           20
3   LBFGS_SCALE_HESSIAN           LBFGS_SCALE_HESSIAN_ENABLE
4   NNET_ACTIVATIONS              'NNET_ACTIVATIONS_LOG_SIG','NNET_ACTIVATIONS_L…
5   NNET_HIDDEN_LAYERS            2
6   NNET_ITERATIONS               100
7   NNET_NODES_PER_LAYER          50,50
8   NNET_REGULARIZER              NNET_REGULARIZER_NONE
9   NNET_TOLERANCE                .000001
10  ODMS_DETAILS                  ODMS_DISABLE
11  ODMS_MISSING_VALUE_TREATMENT  ODMS_MISSING_VALUE_AUTO
12  ODMS_RANDOM_SEED              0
13  ODMS_SAMPLING                 ODMS_SAMPLING_DISABLE
14  PREP_AUTO                     ON
# Score on the selected and tuned model (with k=1, select returns a single model, as the output above shows).
"{:.2}".format(selectmodel.score(x_test, y_test))
Output: 0.94
AutoML on Oracle Cloud (OCI)
Oracle Cloud customers can use the Oracle Machine Learning AutoML user interface, an OML component of Oracle Autonomous Database. It provides data scientists with a no-code, browser-based interface for ML modelling and automated model deployment. In the AutoML user interface, provide all the inputs, e.g. the data source and the prediction type (“Regression” or “Classification”), select the algorithms you want to compare, and start the job. When the job completes, you will have a set of models trained and available within Autonomous Data Warehouse. You can see the ranking of algorithms, feature importance, etc. on the AutoML leaderboard.
Please follow this blog for more details on AutoML on OCI.
