Part of Oracle Cloud Infrastructure (OCI) AI services, the OCI Language service lets you perform sophisticated text analysis at scale without any machine learning (ML) expertise. It provides pretrained models for sentiment analysis, entity extraction, language detection, and many other natural language tasks. Today, we’re excited to announce two new capabilities in the Language service: customizable models and automatic text translation. These features are now in limited availability in all of OCI’s commercial regions and can be accessed through OCI software development kits (SDKs) and REST APIs.
OCI Language enables you to train customized language models, even if you’re not a natural language processing (NLP) expert. You can train custom classification and custom named entity recognition (NER) models. The OCI Language service learns from previously labeled data, such as classified records and samples of entities extracted from text, to train the models. After you train a custom model, you can deploy dedicated endpoints to serve your requests.
Imagine that you’re responsible for the support tickets that come into your company. Each day, you receive thousands of tickets that need to be routed to specific departments. In the past, you had humans perform this task, but the work is tedious, and keeping your employees engaged is difficult. Over the last couple of months, your employees have routed thousands of tickets to the right department. Wouldn’t it be great if AI could help them?
Using OCI Language, you can create a custom model that learns from all the work the humans have done in the past.
When the model is trained on previous conversations, it can classify any new requests automatically, freeing humans to perform less tedious tasks.
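As a rough illustration of how automatic classification could feed ticket routing, the following minimal sketch maps a predicted label to a department queue and falls back to human triage when the model is unsure. The `Prediction` type, the queue names, and the 0.8 confidence threshold are all hypothetical and are not part of the OCI Language API.

```python
from dataclasses import dataclass

# Hypothetical mapping from a predicted label to a department queue.
ROUTES = {
    "billing": "finance-queue",
    "shipping": "logistics-queue",
    "login": "it-helpdesk",
}


@dataclass
class Prediction:
    """Illustrative stand-in for one classification result."""
    label: str
    confidence: float


def route_ticket(prediction, threshold=0.8):
    """Return the queue for a ticket, or manual triage when the model is unsure."""
    if prediction.confidence < threshold or prediction.label not in ROUTES:
        return "manual-triage"
    return ROUTES[prediction.label]


print(route_ticket(Prediction("shipping", 0.93)))
print(route_ticket(Prediction("shipping", 0.41)))
```

Keeping a manual-triage fallback means that low-confidence predictions still reach a person instead of being routed to the wrong department.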
OCI Language allows you to host the custom models on dedicated endpoints. Depending on the expected throughput you need to handle, you can assign a larger or smaller number of inference units (a unit of compute) to your endpoint.
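To make the sizing idea concrete, here is a back-of-the-envelope sketch that estimates how many inference units an endpoint might need for a given load. The throughput of a single inference unit is not stated in this post, so `rps_per_unit` is a placeholder value that you would have to measure or look up yourself.

```python
import math


def inference_units_needed(expected_rps, rps_per_unit):
    """Estimate the number of inference units for an expected request rate.

    rps_per_unit is an assumed, user-supplied figure; it is not a
    documented property of the service.
    """
    if expected_rps < 0 or rps_per_unit <= 0:
        raise ValueError("expected_rps must be >= 0 and rps_per_unit must be > 0")
    # Round up, and always keep at least one unit so the endpoint can serve.
    return max(1, math.ceil(expected_rps / rps_per_unit))


# Example with made-up numbers: 25 requests/s at an assumed 10 requests/s per unit.
print(inference_units_needed(25, 10))
```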
We described a support ticket classification scenario, but the same principles apply to other use cases, such as document classification, clause classification, and intent recognition.
OCI Language can also identify terms that are unique to your domain, such as product part codes, manufacturing terms, and specific financial entities. You provide sample data with labeled entities to train a custom NER model, which can then automatically identify those entities in text.
To illustrate the types of problems that using custom NER can solve, let’s continue our support ticket use case. Imagine that many of the support tickets you receive deal with shipment issues. You want to extract critical information from each ticket, such as the order ID, the shipment date, and the name of the recipient.
Using OCI Language, you can automate this process, but first you must gather the training data. OCI Data Labeling, a service for labeling datasets, can help you label the data to train such a model. It allows you to define the custom entities and then mark the location of those entities in the text, as shown in the following examples:
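One way to picture such labels in code is as a list of spans, each recording an entity type and where it sits in the ticket text. This is only a sketch: the `LabeledEntity` structure, its field names, the entity types, and the sample ticket are hypothetical and do not represent the OCI Data Labeling export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledEntity:
    """Illustrative annotation: an entity type plus its position in the text."""
    label: str
    offset: int  # index of the first character of the entity
    length: int  # number of characters in the entity


def label_span(text, value, label):
    """Build an annotation for the first occurrence of value in text."""
    offset = text.index(value)  # raises ValueError if value is absent
    return LabeledEntity(label, offset, len(value))


def span_text(text, entity):
    """Recover the labeled substring from its offset and length."""
    return text[entity.offset:entity.offset + entity.length]


ticket = "Order ORD-10482 shipped on 2022-03-14 to Jane Doe never arrived."
labels = [
    label_span(ticket, "ORD-10482", "ORDER_ID"),
    label_span(ticket, "2022-03-14", "SHIPMENT_DATE"),
    label_span(ticket, "Jane Doe", "RECIPIENT"),
]

for entity in labels:
    print(entity.label, entity.offset, span_text(ticket, entity))
```

Storing offsets rather than just the matched strings keeps each label tied to one exact location, which matters when the same value appears more than once in a ticket.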
With the labeled data, you can train your own custom NER model. OCI Language provides an intuitive workflow to create and organize models. You can also evaluate global and entity-specific metrics to help you identify other data that you need to further improve your model.
When you’re pleased with the quality of the model, you can create a dedicated endpoint to automatically perform the entity extraction for you. This lets you convert unstructured data (language prose) into structured data, enabling you to automate downstream processes!
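The step from extracted entities to structured data can be sketched as follows. The entity dictionaries below (`type`, `text`, `score`) are an assumed, simplified shape for extraction results, not the actual response schema of the service, and the 0.5 score cutoff is arbitrary.

```python
def entities_to_record(entities, min_score=0.5):
    """Collapse extracted entities into one flat record per ticket.

    Keeps the highest-scoring entity for each type and drops any entity
    whose score is below min_score.
    """
    best = {}
    for entity in entities:
        if entity["score"] < min_score:
            continue
        current = best.get(entity["type"])
        if current is None or entity["score"] > current["score"]:
            best[entity["type"]] = entity
    return {etype: entity["text"] for etype, entity in best.items()}


# Hypothetical extraction output for a single shipment ticket.
extracted = [
    {"type": "ORDER_ID", "text": "ORD-10482", "score": 0.97},
    {"type": "SHIPMENT_DATE", "text": "2022-03-14", "score": 0.91},
    {"type": "RECIPIENT", "text": "Jane Doe", "score": 0.88},
    {"type": "RECIPIENT", "text": "Jane", "score": 0.35},
]

record = entities_to_record(extracted)
print(record)
```

A flat record like this can be written to a database row or passed to an order-lookup system, which is the kind of downstream automation that structured output makes possible.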
OCI Language now allows you to automatically translate text across 21 languages. This feature uses state-of-the-art neural machine translation techniques to translate text at scale with high accuracy. Sample use cases of automatic text translation include building multilingual chatbots, automatic application localization, translation of support tickets, article translation, and any kind of application that helps you understand people around the world.
We’re always expanding our language coverage. At the time of this blog post, the following languages are supported:
Brazilian Portuguese (pt-BR)
Canadian French (fr-CA)
Traditional Chinese (zh-TW)
You can experience the text translation capabilities from the Oracle Cloud Console.
We encourage you to try these exciting new capabilities. They’re available to you from the console and through a variety of SDKs and REST APIs. For more information, see the following resources: