Recently, artificial intelligence (AI) and machine learning (ML) have become key enablers for organizations to remain ahead of the competition and provide value to their shareholders. Enterprises have no shortage of machine learning initiatives, but most fail to achieve their stated goals. An outcome-driven approach is important for a successful ML initiative, and aligning the right ML technology with the purpose of the initiative and the capabilities of the organization is equally important. No one-size-fits-all solution exists, but Oracle Cloud Infrastructure (OCI) provides a robust portfolio of machine learning and artificial intelligence offerings to help customers choose the right fit for their use and purpose.

Data categories for machine learning workloads

The data that we use to derive intelligence comes in many varieties as part of the enterprise continuum. Traditional enterprise data is highly structured and resides in transactional and data warehouse systems. The second category is semi-structured data, which has some organizational properties. Typical examples include XML, emails, medical forms, and invoices. The last category is completely unstructured data, such as images, social data, and media. The following image identifies the distinct categories.

A graphic depicting the categories of data types.

Across these categories of data, the choice of AI or ML offering can depend on the persona of the user. For example, developers with little knowledge of data science can use AI services. People with deep database and SQL skills can use machine learning in Oracle Database. Data scientists can benefit from the OCI Data Science service. The last option, Big AI, is for developers of large AI models who have an army of specialized data scientists and machine learning operations personnel. Looking at the categories of data also shows that we can run ML close to the source of the data, depending on the business problem. Ultimately, the business problem dictates which service or combination of services we use.
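The persona-based guidance above can be summarized as a simple lookup. The following is a minimal sketch, not an official decision tool: the persona labels used as dictionary keys and the function name are illustrative assumptions, while the offerings come from the paragraph above.

```python
# Persona-to-offering mapping taken from the paragraph above.
# The keys are illustrative labels (assumptions), not official Oracle terms.
OFFERING_BY_PERSONA = {
    "application_developer": "OCI AI Services",
    "database_sql_specialist": "Machine Learning in Oracle Database",
    "data_scientist": "OCI Data Science",
    "large_model_team": "Big AI",
}


def suggest_offering(persona: str) -> str:
    """Return a starting-point offering for a persona label."""
    try:
        return OFFERING_BY_PERSONA[persona]
    except KeyError:
        raise ValueError(f"Unknown persona: {persona!r}") from None


print(suggest_offering("database_sql_specialist"))
```

The lookup is only a starting point. As the text notes, the business problem ultimately decides which service or combination of services to use.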

Robust Oracle Data Science, AI, and ML solutions to the rescue

The approaches to handling data and algorithms differ across data types. The level of difficulty increases progressively as we move from structured to unstructured data. Also, deriving intelligence close to the source is becoming more important than ever because of real-time needs. This need is amplified for algorithmic trading and internet of things (IoT), where running ML at or near the source, usually referred to as the edge, is imperative. The result is options such as in-system machine learning and machine learning at the edge. An enterprise with robust structured data working on a business problem outside a data center can use machine learning within the database, instead of developing a set of external systems. For details on Oracle’s Roving Edge Infrastructure, see Roving Edge Infrastructure.

Oracle has developed a robust set of AI and ML solutions that cater to different needs and collectively provide comprehensive AI services for the enterprise. Oracle AI is a portfolio of cloud services for helping organizations take advantage of all data for the next generation of scenarios.

A graphic depicting the unified services for AI and machine learning provided by OCI.

AI Services

OCI AI Services are a collection of services with prebuilt machine learning models that make it easier for developers to apply AI to applications and business operations. The data types used can range from completely structured to unstructured, depending on the AI service applied. Some of the AI services are pretrained. All are used the same way: the developer calls the API for the appropriate service and passes in the data to be processed as the payload, and the service returns a result. You have no infrastructure to manage, and the services are priced based on the number of API calls per month. For more details, see AI Services.
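The call pattern described above (send a payload to a service API, read back a result) can be sketched with the Python standard library alone. This is a hedged illustration of the pattern, not the OCI API: the endpoint URL, the payload shape, and the function names are all hypothetical.

```python
import json
from urllib import request

# Hypothetical endpoint for illustration only; not a real OCI URL.
SERVICE_URL = "https://ai-service.example.com/actions/analyze"


def build_request(text: str) -> request.Request:
    """Package the input text as a JSON payload in a POST request."""
    # Assumed payload shape; a real service defines its own schema.
    payload = {"documents": [{"key": "doc-1", "text": text}]}
    body = json.dumps(payload).encode("utf-8")
    return request.Request(
        SERVICE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_result(raw_body: bytes) -> dict:
    """Decode the JSON document that the service sends back."""
    return json.loads(raw_body.decode("utf-8"))


req = build_request("The delivery arrived two days late.")
print(req.get_method(), req.full_url)
```

Nothing is sent over the network in this sketch. A real call would also need authentication and the service's actual request schema, which are described in the AI Services documentation.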

Machine Learning in Oracle Database

Oracle has a robust solution for machine learning with enterprise data in the data mart and data warehouse. Machine learning in Oracle Database is best suited to organizations that have already invested in building Oracle data marts and hold massive enterprise data sets in them as structured data, and it accelerates their time to market. It lets these organizations develop AI models on their existing data marts and data warehouses built on Oracle databases without any other investment. Machine learning in Oracle Database supports data exploration, preparation, and machine learning modeling at scale using SQL, R, Python, REST, AutoML, and no-code interfaces. It includes more than 30 high-performance in-database algorithms producing models for immediate use in applications. For more details, see Machine Learning in Oracle Databases.

OCI Data Science

The OCI Data Science cloud service focuses on serving the data scientist throughout the machine learning life cycle with support for Python and open source code. OCI Data Science is built around the following core principles:

  • Accelerate the work of the data scientist

  • Enable collaboration and sharing of models

  • Provide enterprise-grade ML

OCI Data Science enables a complete model life cycle, and new features are continually added. OCI Data Science also integrates with external MLOps frameworks, such as Kubeflow and MLflow, for tracking and serving. It allows organizations to derive intelligence from all categories of data. For more details, see Data Science Services.

OCI Data Labeling

OCI Data Labeling is an OCI service that helps organizations identify properties of documents, text, and images and add the appropriate annotations. It works predominantly on unstructured data. The tags and annotations help machine learning models learn to identify a particular class of objects in data that has no tags. Data Labeling is critical for various use cases, including natural language processing (NLP), speech, and vision AI services. For more details, see the Data Labeling documentation.

Big AI

Enterprises small and large are considering developing deep learning models to derive intelligence from constant streams of unstructured data. Examples include training large NLP models, such as GPT-3, where the data is predominantly unstructured. These deep learning models require enterprise-grade computing coupled with low network latency and high-speed file systems to train at scale. Developing large models requires hundreds to thousands of GPUs supported by a low-latency network to enable distributed training. OCI offers a range of NVIDIA GPUs (P100, V100, and A100) as virtual machine (VM) and bare metal instances. OCI offers RDMA over Converged Ethernet (RoCE) to minimize internode latency and offers high throughput of up to 180 Gb per second. GPU VMs are supported by Oracle Container Engine for Kubernetes (OKE), enabling distributed training on a Kubernetes cluster. OCI also supports open source MLOps frameworks and tools. For more details, see Accelerate distributed deep learning with OCI and Machine Learning at Scale with OCI and Kubeflow.
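To put the 180 Gb per second figure in perspective, the following back-of-the-envelope sketch estimates the ideal time to move a payload between nodes at that rate. It is a lower bound under stated assumptions: it ignores latency, protocol overhead, and contention, and the 45 GB payload in the example is a hypothetical value chosen for easy arithmetic, not a measurement.

```python
def ideal_transfer_seconds(payload_gigabytes: float, link_gbps: float = 180.0) -> float:
    """Lower-bound time to move a payload over a link.

    Ignores latency, protocol overhead, and contention.
    """
    if payload_gigabytes < 0 or link_gbps <= 0:
        raise ValueError("payload must be non-negative and link rate positive")
    payload_gigabits = payload_gigabytes * 8  # 1 byte = 8 bits
    return payload_gigabits / link_gbps


# Hypothetical example: 45 GB of data exchanged between two nodes.
print(ideal_transfer_seconds(45))  # 360 Gb / 180 Gb per second = 2.0
```

In distributed training, exchanges like this can happen at every training step, which is why network throughput and latency matter as much as raw GPU count.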

Conclusion

Not every machine learning workload is the same, and one size doesn’t fit all. Enterprises must carefully evaluate the business problem, the source of data and data categories, and existing investments, and then align with the appropriate ML services.

For more information, see the following resources: