X

An Oracle blog about PeopleSoft Technology

Introduction to Machine Learning with PeopleTools 8.58

Rahul Mahashabde
Architect

This post is the first in a series on "Machine Learning with PeopleTools."

PeopleTools 8.58 provides the Peoplesoft Data Distribution Framework, an easy mechanism to integrate with Machine Learning (ML) Platforms. We now have a new menu item – “Machine Learning” in PeopleTools 8.58. Using this, the data in PeopleSoft Databases can be flattened and then pushed to an Elasticsearch index. The Elasticsearch index can then act as a primary data source for training an ML Model.

OCI Data Science is a new Oracle Cloud Infrastructure (OCI) service. It provides a platform for Data Scientists to build, train, and manage Machine Learning Models using Python Jupyter Notebook. The service provides access to a library of utilities and Model explanation tools. These products make it easy for users to build ML models and provide a way to explain the Model results. The Oracle Accelerated Data Science (ADS) SDK is a Python library that is included as part of the Oracle Cloud Infrastructure Data Science service. ADS library provides a friendly interface that simplifies many of the steps in the Data Science workflow such as connecting to data, exploring and visualizing data, training a model with AutoML, evaluating models, and explaining models.

At a high level, these are the steps needed to create an ML data source in PeopleSoft  –

  1. Create a new Search Definition in PeopleSoft or reuse an existing one so that all the ML attributes for the specific use case are captured. This is the place where data can be gathered from multiple sources and flattened for ML.
  2. Create an ML Data source using the Search Definition in PeopleSoft.
  3. Publish the Elasticsearch Index based on this Data source.
  4. Ingest the data for the Elasticsearch Index.

 

 

Once the data is available in an Elasticsearch Index, it is ready to be consumed in OCI Data Science. The next step would be to procure an OCI account if not already available and enable Data Science service for the account. The Data Science service can then be accessed to train and build the ML model. The following steps should be followed to create and save a ML Model in Data Science:

  1. Since Data Science is a collaborative service, we start by creating a new Data Science Project. The Jupyter notebook session is created inside the project.
  2. In the Jupyter notebook, the data in the Elasticsearch index can be accessed via the standard python Elasticsearch connector (this comes pre-loaded as a part of the notebook session).
  3. The data can then be used to train and build the ML model. For training the model, the delivered Oracle AutoML library and the ML explanation library can also be utilized.
  4. The model once built can be saved in the standard Python “pkl” format.

The saved ML Model can be consumed for doing the runtime inference in PeopleSoft. Whenever there is a need to retrain, a new ML model can be regenerated and then consumed. For consumption, one can choose to expose the ML Model as a REST API or load the “pkl” file and call it locally. In the next set of blog posts, we will go into details of each of the steps from getting the data in PeopleSoft to using the ML model for runtime inference.

PeopleSoft is planning to start an early adopter program to help customers use the Data Distribution Framework to jumpstart AI/ML projects.  More information about this program and pre-requisites will be available soon.

More details on OCI Data Science Service can be found here:

OCI Data Science Service Documentation

Accelerated Data Science Library

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.