In this post, I show you how to build and train a model in a notebook session of the Oracle Cloud Infrastructure (OCI) Data Science service and deploy that model with Oracle Functions.
You can download a notebook (.ipynb) version of this blog post. I recommend that you use and run the notebook in your notebook session instead of copying code snippets into a blank notebook.
Before we get started, upgrade the oci client to the latest version. Some of the capabilities around logging and log search are only available in recent versions of oci. Execute the following cell in your notebook to match the packages I’m using in my notebook session:
!pip install oci --upgrade
!pip install fdk==0.1.18
!pip install cloudpickle==1.6
!pip install pandas==1.1.0
!pip install numpy==1.18.5
!pip install scikit-learn==0.23.2
To get started, I train a simple binary classifier using scikit-learn in a notebook session of the OCI Data Science service. The business problem itself and the quality of the model don’t really matter, and the model is only used for illustrative purposes. Any binary classifier would do fine.
I use the scikit-learn make_classification() function to create a balanced synthetic binary classification dataset and use it to train a random forest classifier. The model takes in eight numerical features labeled feat1 through feat8.
You can do the same by launching a notebook session in the Data Science service and then executing the following cell in your notebook. I use resource principals to authenticate to the model catalog, Oracle Object Storage, and Oracle Functions. I recommend going over the getting-started.ipynb notebook to get your session set up with resource principals.
import warnings
warnings.filterwarnings('ignore')
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
import numpy as np
import pandas as pd
import ads
# Using resource principal to authenticate when using the model catalog:
ads.set_auth(auth='resource_principal')
# Here we create a simple and balanced binary classification dataset with 8 features,
# and a sample of 10K rows.
X, y = make_classification(n_samples=10000, n_features=8, n_informative=4,
                           random_state=0, shuffle=True)
# convert to a pandas dataframe for easy manipulations:
train = pd.DataFrame(X, columns=['feat1', 'feat2', 'feat3', 'feat4',
                                 'feat5', 'feat6', 'feat7', 'feat8'])
target = pd.Series(y)
target.value_counts()
From the value_counts() call, you can see that the dataset is balanced, which is what I want.
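For reference, the output of value_counts() should look roughly like the following, with the two classes split about evenly (treat the exact counts as approximate):
1    5000
0    5000
dtype: int64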
Next, train a simple binary classifier. Use the following RandomForestClassifier, also available in scikit-learn:
# training the random forest classifier from scikit-learn:
clf = RandomForestClassifier(max_depth=2, random_state=0)
clf.fit(train, target)
You can check the accuracy of the model on the training dataset using the following command:
predictions = clf.predict(X)
diff = np.abs(predictions - y)
print(f"accuracy (train):{1.0 - sum(diff) / len(predictions)}")
I get an accuracy of about 0.75, which is fine. You should get similar results in your notebook.
Now, we have a working binary classifier. Save this model to the model catalog.
To save the model in the catalog, we use the ADS library and its prepare_generic_model() function, which is probably the easiest approach. First, create a temporary local directory where we store the model artifact files:
from ads.common.model_artifact import ModelArtifact
from ads.common.model_export_util import prepare_generic_model
import os
#Replace with your own path:
path_to_rf_artifact = f"artifact-nov2020"
if not os.path.exists(path_to_rf_artifact):
    os.mkdir(path_to_rf_artifact)
I use ADS prepare_generic_model() to create all the necessary templatized files that are part of the model artifact. You still need to modify each one of the files in the artifact to fit your particular use case.
In the next few cells, I go through each of the files that I modified and created. All the files in the target artifact directory are compressed and shipped to the model catalog as your model artifact.
artifact = prepare_generic_model(path_to_rf_artifact, force_overwrite=True)
In the latest release of the notebook session environment, ADS also generates all the Oracle Functions artifact files (func.py, func.yaml, requirements.txt) by default. Using these files, I later deploy my model as an Oracle function.
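As a quick sanity check, you can list the files that prepare_generic_model() wrote to the artifact directory. The exact set of files can vary with the ADS version, but you should see at least func.py, func.yaml, requirements.txt, and score.py:
# List the files that ADS generated in the artifact directory:
print(os.listdir(path_to_rf_artifact))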
First, serialize the random forest classifier and save it to disk. I use joblib, which is the preferred way to serialize scikit-learn models.
from joblib import dump
dump(clf, os.path.join(path_to_rf_artifact, "rf.joblib"))
Now that we have a serialized model object in our artifact directory, modify the func.py file, which contains the definition of the Oracle Functions handler function (handler()). Oracle Functions invokes handler() each time your function is called.
In the following cell, I write a new version of func.py. Executing this cell overwrites the template that ADS provides as part of the model artifact.
There are a few differences from the template. I import the Python logging library and define the model-prediction and model-input-features loggers. I use these loggers to capture the model predictions and the model input features for each call made to the function. With these loggers, I can monitor how my prediction and feature distributions change over time. Those log entries are captured and stored in the Logging service.
Then, I add some more data transformations in handler(). You can achieve a similar outcome by adding those transformations to the body of predict() in score.py.
%%writefile {path_to_rf_artifact}/func.py
import io
import json
from fdk import response
import sys
sys.path.append('/function')
import score
import pandas as pd
model = score.load_model()
# Importing and configuring logging:
import logging
logging.basicConfig(format='%(name)s - %(levelname)s - %(message)s', level=logging.INFO)
# configuring logging:
# For model predictions:
logger_pred = logging.getLogger('model-prediction')
logger_pred.setLevel(logging.INFO)
# For the input feature vector:
logger_input = logging.getLogger('model-input-features')
logger_input.setLevel(logging.INFO)
def handler(ctx, data: io.BytesIO=None):
    prediction = None  # initialized so the response is well-defined if scoring fails
    try:
        input = json.loads(data.getvalue())['input']
        logger_input.info(input)
        input2 = json.loads(input)
        input_df = pd.DataFrame.from_dict(input2)
        prediction = score.predict(input_df, model)
        logger_pred.info(prediction)
    except (Exception, ValueError) as ex:
        logger_pred.info("prediction fail {}".format(str(ex)))
    return response.Response(
        ctx, response_data=json.dumps("predictions: {}".format(prediction)),
        headers={"Content-Type": "application/json"}
    )
Next, modify the requirements.txt file. ADS generates a template for requirements.txt that provides a best guess at the dependencies necessary to build the Oracle Function and run the model. In this case, I modified the template and added a dependency on scikit-learn version 0.23.2:
%%writefile {path_to_rf_artifact}/requirements.txt
cloudpickle==1.6
pandas==1.1.0
numpy==1.18.5
fdk==0.1.18
scikit-learn==0.23.2
Last, modify the inference script score.py, which loads the model into memory and calls the predict() method of the model object.
By default, ADS generates this file assuming that you’re using cloudpickle to read the model serialized object. I modified score.py to use joblib and left the definition of predict() intact.
%%writefile {path_to_rf_artifact}/score.py
import json
import os
from joblib import load
"""
Inference script. This script is used for prediction by scoring server when schema is known.
"""
def load_model():
    """
    Loads the model from its serialized format.

    Returns
    -------
    model: a model instance on which the predict API can be invoked
    """
    model_dir = os.path.dirname(os.path.realpath(__file__))
    contents = os.listdir(model_dir)
    model_file_name = "rf.joblib"
    # The ADS template loads the model with cloudpickle by default;
    # here the model was serialized with joblib, so we use the joblib loader instead.
    if model_file_name in contents:
        with open(os.path.join(model_dir, model_file_name), "rb") as file:
            model = load(file)  # use the loader corresponding to your model file
    else:
        raise Exception('{0} is not found in model directory {1}'.format(model_file_name, model_dir))
    return model
def predict(data, model=load_model()) -> dict:
    """
    Returns predictions given the model and the data to score.

    Parameters
    ----------
    model: Model instance returned by the load_model API
    data: Data in the format expected by the predict API of the core estimator.
          For example, for scikit-learn models it could be a numpy array, a list of lists, or a pandas DataFrame.

    Returns
    -------
    predictions: Output from the scoring server
        Format: { 'prediction': output from the `model.predict` method }
    """
    assert model is not None, "Model is not loaded"
    # X = pd.read_json(io.StringIO(data)) if isinstance(data, str) else pd.DataFrame.from_dict(data)
    return { 'prediction': model.predict(data).tolist() }
Always test your model artifact in your notebook session before saving it to the catalog, especially if your function depends on it.
First, modify the Python path and insert the path where the score.py module is located. Then import score and call the predict() function defined in score.py. Make predictions on the training dataset and compare them to the predictions array that you created right after training the model. If the load_model() and predict() functions are working correctly, you should retrieve the same predictions array.
# add the path of score.py:
import sys
sys.path.insert(0, path_to_rf_artifact)
from score import load_model, predict
# Load the model to memory
_ = load_model()
# make predictions on the training dataset:
predictions_test = predict(train, _)
# comparing the predictions from predict() to the predictions array I created above.
print(f"The two arrays are equal: {np.array_equal(predictions_test['prediction'], predictions)}")
The two arrays are identical.
Next, test the function’s handler defined in func.py. In the following cell, I use the Fn Project Python SDK (fdk) to import the context object. The context object is a required argument to the handler() function. It contains attributes such as the application ID, the function ID, the call ID, and the content type of the payload data.
In this example, we don’t know the application ID, the function ID, or the call ID of the deployed function. These attributes are only known after the function has been deployed to Oracle Functions. For now, fill in the required parameters with "na", which is fine for testing purposes.
I also take the first five rows of my training dataframe (train[:5]), transform them to JSON, and pass them to the handler. This general example shows how a function can score multiple examples at once:
import func
import json
import logging
from func import handler
from fdk import context
from io import BytesIO
# Example context:
# app_id, fn_id, call_id, and content_type:
ctx = context.InvokeContext("na", "na", "na", content_type='application/json')
# Input feature values:
input_data = {'input':train[:5].to_json()}
json_input = json.dumps(input_data).encode("utf-8")
# calling the Function's handler with my payload:
resp = handler(ctx, BytesIO(json_input))
# remove most logs printed to screen except critical ones
logger = logging.getLogger()
logger.setLevel(logging.CRITICAL)
You should see the outputs of the two loggers that you defined in func.py: a dictionary containing the values of the input features for the five example rows and a prediction array corresponding to the five predictions (0/1) made by the binary classifier.
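As a rough illustration, assuming the logging format configured in func.py, the two log lines look something like the following (the feature values and predictions shown here are placeholders, not real output):
model-input-features - INFO - {"feat1":{"0":-0.59, ...}, "feat2":{"0":1.13, ...}, ...}
model-prediction - INFO - {'prediction': [0, 1, 0, 1, 0]}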
To help with testing the function in production, capture a sample payload of the training data frame. Optionally, save that sample payload file as part of the model artifact file.
# Capturing a sample payload file for testing the Function in production:
with open(os.path.join(path_to_rf_artifact, "sample-payload.json"), 'w') as f:
    json.dump(input_data, f)
We’ve tested the files that we need to deploy the model as an Oracle Function. Now we’re ready to save the model artifact to the model catalog. ADS makes this easy: call the save() method of the model artifact object that you previously created. When you execute that command, you see a data frame with metadata about your model, confirming that the save operation was successful.
You can go back to the project page in the Oracle Cloud Infrastructure Console. Under Models, you should see a new entry corresponding to the model that you saved.
# Saving the model artifact to the model catalog.
artifact.save(display_name='simple-sklearn-model',
              description='An initial test to monitor my model in production')
Warning: If you encounter a __pycache__ error while saving your artifact, simply create a directory called __pycache__ in the artifact directory. You can do it from the terminal with this simple shell command, run from within the artifact directory. Execute the save() command again, and the error should go away.
% mkdir __pycache__
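If you prefer to stay in the notebook, the following Python call (a minimal equivalent of the shell command above) creates the same directory:
# Create __pycache__ inside the artifact directory from the notebook:
os.makedirs(os.path.join(path_to_rf_artifact, "__pycache__"), exist_ok=True)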
We’ve completed the model training steps and saved an artifact to the model catalog. Next, take the model artifact that you saved, build an Oracle Function Docker image, push that image to the Oracle Registry, and deploy it as an Oracle function.
Fortunately, we can accomplish all these steps using the Cloud Shell service, available in the OCI Console. In principle, you can do all that work from your laptop, but Cloud Shell is easy to use and comes preconfigured with your own OCI credentials, Docker, and the Fn CLI (fn).
To access Cloud Shell, go to the Console home page and click the Cloud Shell icon in the top menu bar.
Once you have a Cloud Shell window running, download and unzip the model artifact you created using these two shell commands. Replace <your-model-ocid> with your model OCID value. You can find the model OCID on the model details page.
% oci data-science model get-artifact-content --model-id <your-model-ocid> --file test.zip
% unzip test.zip
Create an Oracle Function application to host your machine learning model function. Go to the Console and, under Developer services, select Functions. Then select Create Application.
After creating your application, ensure that your application has logs enabled. Under the Resources menu, go to Logs. Select a log group, log name, and retention period for your logs.
Once you’ve created an application, I recommend that you go through Getting Started in the Resources menu of your application. The guide walks you through the necessary steps to configure the fn CLI in your Cloud Shell environment and Docker.
Once you’ve completed the Getting Started guide, you can deploy your function. Change the value <your-app-name> to the name of the application you previously created.
fn --verbose deploy --app <your-app-name>
You should see a successful Docker image build in Cloud Shell. The image is stored in the Registry, and you can find your new function in your application in the Console.
Now that your Oracle Function has been successfully deployed, you need to test it. You can easily test the function by submitting a few requests using the OCI Python SDK directly in your notebook. If you prefer to use Cloud Shell, you can also invoke your model directly with the fn CLI tool.
Import the libraries that you need, most notably oci and the functions client.
import matplotlib.pyplot as plt
import datetime
import gzip
import oci
from oci import pagination
import oci.functions as functions
from oci.functions import FunctionsManagementClient, FunctionsInvokeClient
I use resource principals in my notebook session to authenticate against Oracle Functions. Instantiate a FunctionsManagementClient object using the resource principals signer object, allowing your notebook session to call functions without using your personal config and key files. The FunctionsManagementClient object lets you list all the applications and functions in your compartment.
# Let's first get the resource principals signer object:
rps = oci.auth.signers.get_resource_principals_signer()
# Instantiate a FunctionsManagementClient object using the resource principals signer:
fn_management_client = FunctionsManagementClient(config={}, signer=rps)
Fill in your application's compartment OCID, as well as the application and function names:
# Your application compartment OCID:
compartment_id = f"<your-application-compartment-OCID>"
# Your application name:
app_name = f"<your-application-name>"
# Your Function name:
fn_name = f"<your-function-name>"
Next, we need to get the invoke_endpoint of the function, which requires the function OCID. If you want to stay in the notebook session environment, you can find your function OCID by first listing the applications in the compartment. You can inspect the results by calling app_result.data.
Then, identify the application and extract its OCID by accessing the .id attribute of each entry in the list. Because each application is a collection of functions, the same logic applies one level down: list the functions inside the application and find the relevant one.
app_result = pagination.list_call_get_all_results(
    fn_management_client.list_applications,
    compartment_id,
    display_name=app_name
)
# app_result.data
# OCID of the first application in the list:
# app_result.data[0].id
fn_result = pagination.list_call_get_all_results(
    fn_management_client.list_functions,
    app_result.data[0].id,
    display_name=fn_name
)
# fn_result.data
# OCID of the first Function in the list:
# fn_result.data[0].id
I have multiple functions in my application, but the relevant one is the first in my list ([0]). Extract the .invoke_endpoint attribute of that function and pass it to a FunctionsInvokeClient object, which allows you to invoke the function.
invoke_client = FunctionsInvokeClient({}, signer=rps, service_endpoint=fn_result.data[0].invoke_endpoint)
The client object that I instantiated (invoke_client) has an invoke_function() method that allows me to submit requests to my function. It takes the function OCID as the required parameter.
I cut the training data frame into blocks of five rows and pass them as the JSON payload to my function endpoint. You can execute this cell a few times to simulate calls made to your function, and you can change the number of rows you pass to the model in each call.
The first time you call your function, it can take much more time to get a response back (>30 sec). That’s the cold start problem.
For each call made to your function, the output of the loggers you created in func.py is captured and stored by the Logging service.
%%time
# calling my model 10 times. Each time, I pass five rows from my training dataset converted to a JSON payload (content-type: application/json)
for x in range(10):
    resp = invoke_client.invoke_function(fn_result.data[0].id,
                                         invoke_function_body=json.dumps({'input': train[5*x:5*(x+1)].to_json()}))
    print(resp.data.text)
Congratulations! You have made it to the end!
In an upcoming post, I show you how you can use the logs generated by your Function to monitor the performance of your model.
I've also written posts under my full first name. You can find those posts here: https://blogs.oracle.com/ai-and-datascience/authors/Blog-Author/COREA7667DA212B34765B4DB91B94737F00E/jean-rene-gauthier