Learn about Oracle Machine Learning for Oracle Database and Big Data, on-premises and Oracle Cloud

  • March 3, 2021

Using Oracle Machine Learning for Python REST API

In our previous post, we highlighted OML4Py’s Embedded Python Execution using the Python API. In this post, I’ll introduce the REST API for invoking Python functions to be run in Python engines spawned and managed by the database environment.

Introduction

Using any REST client, such as cURL or Postman, you can invoke scripts stored in the OML4Py script repository. The OML4Py REST API includes endpoints for listing and running user-defined Python functions, also referred to as scripts, saved in the OML4Py script repository.

A typical workflow includes:

  • Authenticating to the Autonomous Database service 
  • Creating a user-defined Python function and saving it as a named script in the OML4Py script repository
  • Invoking the saved script from the REST client

Prerequisites

The REST API for Embedded Python Execution requires that you specify the name of a script in the OML4Py script repository in your Autonomous Database.

Get Account Information

The account creation email from the Autonomous Database service contains the user name and password for your Oracle Machine Learning (OML) account. If you don't have this information, then contact your Autonomous Database administrator.

Get an Authentication Token

OML REST APIs use tokens to authenticate an OML user. You access the OML4Py REST API Embedded Python Execution functions over HTTP and must provide the user name and password for your OML account.

To authenticate and obtain a token, use cURL with the -d option to pass the user name and password for your Oracle Machine Learning account to the Oracle Machine Learning User Management Cloud Service REST endpoint /oauth2/v1/token.

The following provides an example cURL command. The example uses the following values:

  • omlserver is the Autonomous Database URL
  • tenant is the Autonomous Database tenancy OCID
  • username is the OML user name
  • password is the OML user password
  • database is the database name

$ curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{"grant_type":"password", "username":"'${username}'", "password":"'${password}'"}' "${omlserver}/omlusers/tenants/${tenant}/databases/${database}/api/oauth2/v1/token"

If the request is successful, you will receive an HTTP 200 response header with the access token included in the body:

HTTP/1.1 200 OK
Date: Tues, 2 Mar 2020 21:59:16 GMT
Content-Type: application/json
Content-Length: 1027
Connection: keep-alive
Cache-Control: no-cache, no-store, private
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1;mode=block
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Content-Type-Options: nosniff

{"accessToken":"eyJhbGciOiJ...==","expiresIn":3600,"tokenType":"Bearer"}

Copy the entire accessToken value from the response and assign it to the variable token, surrounded by single quotes:

$ export token='eyJhbGciOiJ...=='

You can now use the token variable in calls to the REST API for Embedded Python Execution endpoints. The access token has a life span of one hour and can be refreshed before it expires.
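For readers who prefer Python to cURL, the token request might be sketched with the standard library as follows. The helper names (build_token_request, parse_token_response, token_is_fresh, fetch_token) and the 60-second safety margin are assumptions made for illustration; the URL pattern, request fields, and response keys are copied from the cURL example and sample response above.

```python
import json
import time
import urllib.request


def build_token_request(omlserver, tenant, database, username, password):
    """Build the same POST request as the cURL example (nothing is sent here)."""
    url = (f"{omlserver}/omlusers/tenants/{tenant}"
           f"/databases/{database}/api/oauth2/v1/token")
    payload = {"grant_type": "password",
               "username": username,
               "password": password}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST",
    )


def parse_token_response(body, now=None):
    """Extract the token and turn expiresIn (seconds) into an absolute time."""
    doc = json.loads(body)
    now = time.time() if now is None else now
    return {"token": doc["accessToken"], "expires_at": now + doc["expiresIn"]}


def token_is_fresh(token_info, now=None, margin=60):
    """Return True while the token has more than `margin` seconds left."""
    now = time.time() if now is None else now
    return now < token_info["expires_at"] - margin


def fetch_token(omlserver, tenant, database, username, password):
    """Send the request; requires network access to a real OML endpoint."""
    req = build_token_request(omlserver, tenant, database, username, password)
    with urllib.request.urlopen(req) as resp:
        return parse_token_response(resp.read().decode("utf-8"))
```

A client could call token_is_fresh before each REST call and request a new token once it returns False, rather than waiting for a call to fail after the one-hour life span has passed.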

Score a model saved in the OML4Py datastore

In this example, we use an sklearn model saved in the OML4Py Datastore to score data.

Model regr is defined in OML4Py and saved to the Datastore, then loaded in the scoring function regr_predict. The regr_predict function is saved in a script residing in the OML4Py script repository and deployed from the REST API.

Step 1: Load a Pandas Dataframe to a database table

A feature of the transparency layer is the creation of database objects for Python objects. For example, you can import a data set into your Python session and then create database objects from that data set.

Using the OML4Py Python notebook interpreter, we load the sklearn iris data set and combine target and predictor values into a single Pandas DataFrame. Then we use the oml.create function to create a persistent database table IRIS from the DataFrame.

import pandas as pd
import oml
from sklearn.datasets import load_iris

iris = load_iris()
x = pd.DataFrame(iris.data,
                 columns = ["SEPAL_LENGTH", "SEPAL_WIDTH", "PETAL_LENGTH", "PETAL_WIDTH"])
y = pd.DataFrame(list(map(lambda x: {0:'setosa', 1: 'versicolor', 2:'virginica'}[x], iris.target)),
                 columns = ['SPECIES'])
iris_df = pd.concat([x,y], axis=1)
IRIS = oml.create(iris_df, table = "IRIS")

Step 2: Build a regression model

Build a regression model regr and save it to the datastore.

from sklearn import linear_model

# Build a regression model using in-memory data.
iris = IRIS.pull()
regr = linear_model.LinearRegression()
regr.fit(iris[['SEPAL_WIDTH', 'PETAL_LENGTH', 'PETAL_WIDTH']],
         iris[['SEPAL_LENGTH']])
regr.coef_

array([[ 0.65083716, 0.70913196, -0.55648266]])

Save the model regr to the OML4Py Datastore ds_regr:

oml.ds.save(objs={'regr':regr}, name="ds_regr", grantable=True, overwrite=True)
oml.ds.dir()

datastore_name  object_count  size  date                 description
ds_regr         1             496   2020-11-16 20:20:08  None

Step 3: Define a scoring function

Define a scoring function named regr_predict as a string and save it to the OML4Py script repository. The regr_predict function loads the datastore containing the model and scores the model.

regr_predict = """def regr_predict(dat):
    import pandas as pd
    import oml
    obj_dict = oml.ds.load(name="ds_regr", to_globals=False)
    regr = obj_dict["regr"]
    pred = regr.predict(dat[['SEPAL_LENGTH',
                             'SEPAL_WIDTH',
                             'PETAL_LENGTH']])
    return pd.concat([dat[['SPECIES', 'PETAL_WIDTH']],
                      pd.DataFrame(pred,
                                   columns=['Pred_PETAL_WIDTH'])],
                     axis=1)"""

oml.script.create("regr_predict", func=regr_predict, overwrite=True)
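Because the function is kept as a string, a syntax error in it is easy to miss before the script is saved. One possible safeguard is to parse the string locally first. The sketch below uses only the standard library ast module; the helper defined_function_names and the shortened sample source are hypothetical, and nothing here is part of OML4Py.

```python
import ast


def defined_function_names(source):
    """Parse source text and return the names of its top-level functions.

    Raises SyntaxError if the text is not valid Python. The code is only
    parsed, never executed, so its imports do not need to be installed.
    """
    tree = ast.parse(source)
    return [node.name for node in tree.body
            if isinstance(node, ast.FunctionDef)]


# Shortened stand-in for the script string defined above.
sample_source = """def regr_predict(dat):
    return dat
"""

names = defined_function_names(sample_source)
print(names)
```

Checking that the list contains the expected function name before saving the script catches both malformed source and a mismatch between the script name and the function it defines.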

Step 4: Score the model

Now use row-apply to score the model on the IRIS table from the Python interface, and then invoke the equivalent REST API call.  We run the script regr_predict from the REST API in the default mode, which is synchronous. The scoring runs in parallel using row-apply with 5 rows per parallel engine.  

oml.row_apply(IRIS, regr_predict, rows=5, parallel=True)

      SPECIES  PETAL_WIDTH  Pred_PETAL_WIDTH
0      setosa          0.2          6.846241
1  versicolor          1.1          5.278649
2  versicolor          1.0          5.095180
3  versicolor          1.2          5.248665
...
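To make the rows=5 argument concrete, here is a conceptual illustration in plain Python of how a row-wise apply splits its input into chunks and calls the function once per chunk. It is not how OML4Py implements oml.row_apply, which operates on database data and runs the chunks in Python engines spawned by the database; chunked, row_apply_sketch, and count_rows are hypothetical names.

```python
def chunked(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def row_apply_sketch(rows, func, size):
    """Call func once per chunk and collect the results in chunk order."""
    return [func(chunk) for chunk in chunked(rows, size)]


def count_rows(chunk):
    """Stand-in for a scoring function: report how many rows it received."""
    return len(chunk)


# 12 stand-in rows with 5 rows per call give chunks of 5, 5, and 2.
sizes = row_apply_sketch(list(range(12)), count_rows, size=5)
print(sizes)  # [5, 5, 2]
```

Under this model, the 150-row iris data with rows=5 would result in 30 invocations of the scoring function, each receiving 5 rows.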

From the REST API, you can specify the Autonomous Database service level to use with the optional service argument. The service names are LOW, MEDIUM, and HIGH. The service names provide different levels of performance and concurrency for an Autonomous Database. In this case, we use service level MEDIUM.

curl -i -X POST --header "Authorization: Bearer ${token}" --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{"input":"select * from IRIS", "parameters":"{\"oml_input_type\":\"pandas.DataFrame\"}", "rows":5, "parallelFlag":true, "service":"MEDIUM"}' "${omlserver}/oml/tenants/${tenant}/databases/${database}/api/py-scripts/v1/row-apply/regr_predict"
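One detail of the request body that is easy to miss is that parameters is itself a JSON document encoded as a string, which is why the cURL command escapes its inner quotes. A minimal Python sketch of building the same body and headers follows. build_row_apply_request is a hypothetical helper and the host and token in the usage lines are placeholders; the field names and URL pattern are copied from the cURL example.

```python
import json
import urllib.request

SERVICE_LEVELS = ("LOW", "MEDIUM", "HIGH")


def build_row_apply_request(omlserver, tenant, database, script, token,
                            query, rows, service="MEDIUM"):
    """Build the row-apply POST request (nothing is sent here)."""
    if service not in SERVICE_LEVELS:
        raise ValueError(f"service must be one of {SERVICE_LEVELS}")
    url = (f"{omlserver}/oml/tenants/{tenant}/databases/{database}"
           f"/api/py-scripts/v1/row-apply/{script}")
    body = {
        "input": query,
        # Inner JSON is serialized first, so it travels as a string value.
        "parameters": json.dumps({"oml_input_type": "pandas.DataFrame"}),
        "rows": rows,
        "parallelFlag": True,
        "service": service,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST",
    )


# Placeholder values; a real call needs your own URL, OCID, and token.
req = build_row_apply_request("https://example.invalid", "t1", "db1",
                              "regr_predict", "abc",
                              "select * from IRIS", rows=5)
print(req.data.decode("utf-8"))
```

Serializing the inner dictionary with json.dumps and then serializing the outer body again produces the escaped quotes automatically, which avoids hand-written backslashes.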

A portion of the returned value is:

{"result":[{"PETAL_WIDTH":0.2,"SPECIES":"setosa","Pred_PETAL_WIDTH":6.8462408185},{"PETAL_WIDTH":1.1,"SPECIES":"versicolor","Pred_PETAL_WIDTH":5.2786489228},{"PETAL_WIDTH":1,"SPECIES":"versicolor","Pred_PETAL_WIDTH":5.0951801182},{"PETAL_WIDTH":1.2,"SPECIES":"versicolor","Pred_PETAL_WIDTH":5.24866547}...
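The response body is JSON with a result array of row objects, so a client can consume it directly. As a small sketch, the snippet below parses the first three rows shown above and extracts the predictions. The truncated remainder is omitted and the array is closed by hand so that the sample is valid JSON; the variable names are illustrative.

```python
import json

# First three rows of the response shown above, closed by hand.
sample = (
    '{"result":['
    '{"PETAL_WIDTH":0.2,"SPECIES":"setosa","Pred_PETAL_WIDTH":6.8462408185},'
    '{"PETAL_WIDTH":1.1,"SPECIES":"versicolor","Pred_PETAL_WIDTH":5.2786489228},'
    '{"PETAL_WIDTH":1,"SPECIES":"versicolor","Pred_PETAL_WIDTH":5.0951801182}'
    ']}'
)

rows = json.loads(sample)["result"]
predictions = [(row["SPECIES"], row["Pred_PETAL_WIDTH"]) for row in rows]

for species, pred in predictions:
    print(f"{species:<12}{pred:.4f}")
```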

Note that, while cURL is used to run REST commands in this blog, any REST client can be used.

The OML4Py REST endpoints can be called from a web development application, improving the portability of the interface to other types of platforms, increasing project scalability, and enabling data science teams to own the end-to-end lifecycle of machine learning scripts from training to production. 

Please visit the OML4Py REST API for Embedded Python Execution Guide for details on the available REST endpoints and examples. Additionally, refer to the OML4Py REST API Template notebook example in your Autonomous Database OML user account for additional workflow examples.
