Announcing Oracle Cloud Infrastructure Language for AI-powered text analysis

May 6, 2021 | 6 minute read
Shahid Reza
Director, Software Development

We’re excited to announce the general availability of Oracle Cloud Infrastructure (OCI) Language, which allows customers to uncover insights in unstructured text using natural language processing (NLP). Developers can integrate pretrained NLP capabilities into applications, without needing data scientists to create customized models. By automating text analysis at scale, customers can gain insights for improving customer experience and increasing efficiency.

You can access the service through the OCI Console, the OCI SDKs for Python, Java, Go, TypeScript, .NET, and PowerShell, or the OCI CLI.

This service is the first in a family of upcoming OCI AI Services, which will make it easy for businesses to add prebuilt AI capabilities to applications for a variety of industry use cases. To learn more about the AI Services, watch the keynote from Developer Live: AI and ML for Your Enterprise.

Let’s look at the key capabilities and use cases for OCI Language.

OCI Language is visible in the main menu of the OCI Console


OCI Language pretrained tools


Read the OCI Language documentation


Why use AI for text analysis?

Text data, such as social media posts, news, and surveys, provide valuable business and customer insights. However, it’s often too time-consuming for humans to analyze large amounts of textual data, so companies turn to NLP to gain insights effectively and at scale.

To use these NLP capabilities, they rely on data scientists to build and train custom machine learning models, then deploy these models into applications. This process is often time-consuming and expensive.

OCI Language reduces this time and effort by providing key language processing capabilities as production-ready REST APIs. These capabilities include:

  • Language Detection: Detects 100+ languages from input text.
  • Named Entity Recognition: Identifies known entities in the text, such as people, places, and personally identifiable information (PII).
  • Sentiment Analysis: Identifies the aspects mentioned in the text and determines whether the sentiment toward each is positive, negative, or neutral.
  • Key Phrase Extraction: Quickly identifies the important words and phrases in input text.
  • Text Classification: Analyzes the text and classifies the text into a set of categories.

Let’s dive into more detail about these key capabilities.  


Language Detection

Within OCI Language, the Language Detection feature identifies the language of a given text. It can currently identify more than 100 languages. Language-specific models and services can use Language Detection as a preliminary step, and other OCI services can use it to identify language-related barriers before running deeper inference on the text.
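As a rough illustration of using detection as a preliminary step, the sketch below calls the service through the OCI Python SDK and keeps the highest-scoring language. It assumes the SDK's `oci.ai_language` module (`AIServiceLanguageClient`, `DetectDominantLanguageDetails`) and a configured `~/.oci/config` profile. The result attributes (`languages`, `code`, `score`), the helper names `detect_language` and `top_language`, and the 0.5 threshold are assumptions for illustration, so check them against the SDK reference for your version.

```python
from types import SimpleNamespace


def detect_language(text):
    """Call OCI Language and return the raw detection result.

    Assumes the `oci` package is installed and ~/.oci/config is set up.
    The import is local so the rest of this sketch runs without the SDK.
    """
    import oci  # OCI Python SDK (assumed to be installed)

    config = oci.config.from_file()
    client = oci.ai_language.AIServiceLanguageClient(config)
    details = oci.ai_language.models.DetectDominantLanguageDetails(text=text)
    return client.detect_dominant_language(details).data


def top_language(result, min_score=0.5):
    """Return (code, score) for the best candidate, or None if none qualifies.

    `result.languages` is assumed to be a list of objects with `code` and `score`.
    """
    candidates = [lang for lang in result.languages if lang.score >= min_score]
    if not candidates:
        return None
    best = max(candidates, key=lambda lang: lang.score)
    return best.code, best.score


# Offline stand-in for a service response, shaped like the assumed result.
sample = SimpleNamespace(languages=[
    SimpleNamespace(name="English", code="en", score=0.99),
])
print(top_language(sample))
```

A downstream service could use the returned code to pick a language-specific model, or skip further analysis when `top_language` returns `None`.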


Named Entity Recognition

Named Entity Recognition (NER) identifies entities in text and categorizes them as people, places, organizations, dates and times, quantities, percentages, or currencies. The feature currently identifies 19 entity types and 5 personally identifiable information (PII) entity types. NER automatically scans entire articles to identify significant entities. The PII feature identifies and redacts sensitive entities in the text, such as a phone number, email address, mailing address, or URL.
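To make the redaction idea concrete, here is a minimal sketch of replacing PII spans once entity positions are known. The entity layout (`offset`, `length`, `type`), the type labels `EMAIL` and `PHONE_NUMBER`, and the helper names are hypothetical and chosen for illustration; the service's actual response fields may differ.

```python
def span(text, value, label):
    """Build a hypothetical entity record for the first occurrence of `value`."""
    return {"offset": text.index(value), "length": len(value), "type": label}


def redact(text, entities, pii_types=frozenset({"EMAIL", "PHONE_NUMBER"})):
    """Replace each PII entity span with a [TYPE] placeholder."""
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["offset"], reverse=True):
        if ent["type"] not in pii_types:
            continue
        start = ent["offset"]
        end = start + ent["length"]
        text = text[:start] + "[" + ent["type"] + "]" + text[end:]
    return text


# Hypothetical message and spans; a real call to the service would supply the spans.
message = "Contact Ana at ana@example.com or 555-0100."
spans = [
    span(message, "Ana", "PERSON"),
    span(message, "ana@example.com", "EMAIL"),
    span(message, "555-0100", "PHONE_NUMBER"),
]
print(redact(message, spans))
```

Processing spans from the end of the string backward avoids recomputing offsets after each substitution changes the text length.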


Text Classification

Text Classification identifies the topic of a given input document or text. It returns a category, with a confidence score, from a set of predefined categories. You can use this capability to tag textual data by topic, making it easier to classify and retrieve relevant information.

Text Classification can also classify the content of a collection of documents to determine common themes. For example, a collection of news articles might be classified as News and Media/Business News/Company News. The result expresses the theme as primary, secondary, and tertiary categories, giving progressively more specific insight into the documents.
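The primary/secondary/tertiary structure can be sketched as a simple split of the category label. This assumes the category arrives as a slash-delimited string like the one shown above; the helper name `split_category` is hypothetical, and the actual response format may differ.

```python
def split_category(label):
    """Split a slash-delimited category label into named levels."""
    levels = ("primary", "secondary", "tertiary")
    parts = [part.strip() for part in label.split("/")]
    # zip stops at the shorter sequence, so two-level labels simply omit "tertiary".
    return dict(zip(levels, parts))


print(split_category("News and Media/Business News/Company News"))
```

Tagging each document with these levels lets you filter a collection broadly (all "News and Media") or narrowly (only "Company News").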


Key Phrase Extraction

Key Phrase Extraction identifies essential phrases in unstructured input text. It helps you understand customers’ key intents and attributes at a glance. The feature is built on deep learning and helps define the context-based intent or aspect of a given text. Key phrases summarize the content of a text and reveal its main topics, making it easy to extract the most relevant words and find insights related to the text’s main points.


Aspect-Based Sentiment Analysis

The Aspect-Based Sentiment Analysis feature extracts the critical components of text and provides the associated sentiment – either positive, negative, or neutral.

With Aspect-Based Sentiment Analysis, businesses can become more customer-centric. The feature is vital for understanding feedback in reviews, surveys, and social media posts. It helps companies listen to their customers’ needs, analyze their feedback, and learn more about customers’ experiences with and expectations for a product or service.


An example using all features

Here’s an example showing what all of the features would return for this input text:

"A booster shot of Moderna's Covid-19 vaccine revs up the immune response against two worrying coronavirus variants, and a booster dose formulated specifically to match the B.1.351 variant first seen in South Africa was even more effective, Moderna says"

OCI Language example using all features

Language Detection identifies the language as English, with a confidence score of about 0.99.

Text Classification identifies the main category as "health and medical/conditions and disease."

Named Entity Recognition finds five entities: "Moderna's Covid-19" as a product, "two" as a cardinal (meaning a cardinal number), "B.1.351" as a product, "South Africa" as a GPE (geopolitical entity), and "Moderna" as an organization. 

Key Phrase Extraction identifies "booster shot of Moderna's Covid-19 vaccine", "immune response", "coronavirus variants", "booster dose", "B.1.351 variant", "South Africa", and "Moderna" as important phrases.  

Finally, Aspect-Based Sentiment Analysis extracts "booster shot", "vaccine", and "booster dose" as aspects, all with a positive sentiment.  
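One way to work with these results together is to collect them into a single record and group items by label. The values below are transcribed from the example above; the dict layout and the helper name `group_by_label` are hypothetical and do not represent the service's response format.

```python
from collections import defaultdict

# Values transcribed from the example; the layout itself is an assumption.
analysis = {
    "language": {"name": "English", "score": 0.99},  # score is approximate
    "category": "health and medical/conditions and disease",
    "entities": [
        ("Moderna's Covid-19", "product"),
        ("two", "cardinal"),
        ("B.1.351", "product"),
        ("South Africa", "GPE"),
        ("Moderna", "organization"),
    ],
    "aspects": [
        ("booster shot", "positive"),
        ("vaccine", "positive"),
        ("booster dose", "positive"),
    ],
}


def group_by_label(pairs):
    """Group (text, label) pairs into {label: [text, ...]}, keeping input order."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)


entities_by_type = group_by_label(analysis["entities"])
aspects_by_sentiment = group_by_label(analysis["aspects"])
print(entities_by_type["product"])
print(aspects_by_sentiment)
```

Grouping this way makes it straightforward to answer questions such as "which products are mentioned?" or "which aspects drew positive sentiment?" from a single analyzed document.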


Serve customers more effectively using text analysis

OCI Language will help businesses improve their customer experience while reducing the time and effort to analyze text data.

The service has multiple use cases across lines of business, including:

  • Marketing: Analyze social media, reviews, and news to see what customers and industry experts are saying about your product. See what they do and don’t like, what new features they want, and how you compare to your competitors.
  • Customer support: Classify support tickets by product and department, so that tickets get to the appropriate team faster. Use sentiment analysis to identify urgent pain points and prioritize tickets.
  • Human resources: Automate resume screening by using entity recognition to identify key skills and education. Classify employee feedback using sentiment analysis and entity recognition to identify the most common pain points among employees and the best next steps to take.
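As a sketch of the customer support use case, the snippet below groups tickets by department and puts the most negative ones first. It assumes each ticket has already been labeled with a department (from classification) and a sentiment (from sentiment analysis); the ticket fields, sample data, and the `route_tickets` helper are all hypothetical.

```python
from collections import defaultdict

# Lower rank means higher priority: negative feedback is handled first.
SENTIMENT_RANK = {"negative": 0, "neutral": 1, "positive": 2}


def route_tickets(tickets):
    """Group tickets by department, most negative first within each queue."""
    queues = defaultdict(list)
    for ticket in tickets:
        queues[ticket["department"]].append(ticket)
    for queue in queues.values():
        # list.sort is stable, so tickets with equal sentiment keep arrival order.
        queue.sort(key=lambda t: SENTIMENT_RANK[t["sentiment"]])
    return dict(queues)


# Hypothetical, already-labeled tickets in arrival order.
tickets = [
    {"id": 1, "department": "billing", "sentiment": "neutral"},
    {"id": 2, "department": "billing", "sentiment": "negative"},
    {"id": 3, "department": "shipping", "sentiment": "positive"},
    {"id": 4, "department": "billing", "sentiment": "negative"},
]
queues = route_tickets(tickets)
print([t["id"] for t in queues["billing"]])
```

In practice the ranking could also weigh confidence scores or ticket age; sentiment alone is used here to keep the example short.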

We will publish additional blog posts that dive deeper into some of these areas in the coming weeks. Stay tuned.

Try Oracle Cloud Infrastructure (OCI) Language for free using $300 in cloud credits

