Data is gold! But only if you can properly analyze it to make it useful.
About 80% of the world’s data is in an unstructured format: video files, PDFs, spreadsheets, images, and so on. Many of these unstructured documents contain plain text written in prose. Even databases have many semi-structured tables because they contain textual fields, such as descriptions or customer support requests. Dealing with large quantities of text data is a complex task, but this information is valuable!
What if we could apply state-of-the-art AI models to unstructured data to extract insights from it? Oracle Cloud Infrastructure (OCI) now offers several AI services that allow you to harness the power of artificial intelligence. They offer pretrained models ready for you to use and easy-to-train custom models that any developer can tweak. These AI services are geared toward developers, and you don’t need to be an expert data scientist to consume them.
In this blog post, we describe a pattern that uses OCI Language, one of our AI services, as part of an enrichment pipeline: you ingest unstructured textual data, extract insights from it, and then analyze and visualize those insights.
Let’s take the task of dealing with customer feedback, a problem common to many industry verticals. Imagine that you’re working in the hospitality business. You manage several hotels, and you have received thousands of reviews for your properties. A treasure trove of information! But only if you can convert those reviews into actionable insights!
OCI Language can perform sophisticated text analysis at scale. Its pretrained models cover capabilities such as sentiment analysis and named entity extraction, so you don’t need to be a data scientist to put state-of-the-art text analysis to work.
OCI Data Integration is a service that allows you to define custom data transformation flows. For our hotel reviews problem, you can create a data flow to read your unstructured data, call OCI Language to extract insights from the text, and then project the extracted insights into structured tables in a database, as shown in the following data flow.
When Data Integration runs this flow, the output of the AI services ends up in structured tables that you can use for analytics purposes. For example, you can have a table with one record for each aspect found in a sentence, along with that aspect’s sentiment, as shown in the following example.
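To make the shape of such an aspects table concrete, here is a minimal Python sketch of the "projection" step. It assumes a simplified, hypothetical analysis result (the `sentences`, `aspects`, `sentiment`, and `score` keys are illustrative, not the actual OCI Language response format), and the sample review and scores are hand-written for illustration only.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class AspectRow:
    """One record of the aspects table: a single aspect found in a sentence."""
    review_id: str
    sentence: str
    aspect: str
    sentiment: str
    score: float


def flatten_aspects(review_id: str, analysis: dict[str, Any]) -> list[AspectRow]:
    """Turn a nested per-review analysis into flat, table-like rows.

    The input layout is an assumption made for this sketch, not a real
    service response.
    """
    rows: list[AspectRow] = []
    for sentence in analysis.get("sentences", []):
        for aspect in sentence.get("aspects", []):
            rows.append(
                AspectRow(
                    review_id=review_id,
                    sentence=sentence["text"],
                    aspect=aspect["text"],
                    sentiment=aspect["sentiment"],
                    score=aspect["score"],
                )
            )
    return rows


# Hypothetical, hand-written analysis of one hotel review (scores are made up).
sample_analysis = {
    "sentences": [
        {
            "text": "The room was spotless but the breakfast was cold.",
            "aspects": [
                {"text": "room", "sentiment": "Positive", "score": 0.97},
                {"text": "breakfast", "sentiment": "Negative", "score": 0.91},
            ],
        },
        # A sentence without aspects contributes no rows.
        {"text": "We arrived on Friday.", "aspects": []},
    ]
}

rows = flatten_aspects("review-001", sample_analysis)
for row in rows:
    print(row.review_id, row.aspect, row.sentiment, row.score)
```

Each nested aspect becomes one flat row keyed by the review it came from, which is the form a database table or an analytics tool can consume directly.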
Transforming the thousands of unstructured reviews into structured formats, such as the aspects table, opens the door to using the data for scenarios such as data analytics, training machine learning models, and search. In this specific scenario, you can load the data into Oracle Analytics Cloud to visualize the insights and explore the information in a way that helps you identify actionable tasks.
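As a rough illustration of the kind of question structured aspect data can answer, the following Python sketch counts sentiments per aspect and ranks aspects by their share of negative mentions. The rows are invented sample data, and the `summarize` and `negative_share` helpers are hypothetical names for this sketch; in practice an analytics tool would run this kind of aggregation over the real table.

```python
from collections import Counter, defaultdict

# Invented sample rows: (review_id, aspect, sentiment).
aspect_rows = [
    ("review-001", "room", "Positive"),
    ("review-001", "breakfast", "Negative"),
    ("review-002", "room", "Positive"),
    ("review-003", "breakfast", "Negative"),
    ("review-003", "staff", "Positive"),
    ("review-004", "room", "Negative"),
]


def summarize(rows):
    """Count how often each sentiment occurs for each aspect."""
    summary = defaultdict(Counter)
    for _review_id, aspect, sentiment in rows:
        summary[aspect][sentiment] += 1
    return summary


def negative_share(counts):
    """Fraction of an aspect's mentions that are negative."""
    total = sum(counts.values())
    return counts["Negative"] / total if total else 0.0


summary = summarize(aspect_rows)

# Aspects with the highest share of negative mentions come first.
ranked = sorted(summary, key=lambda a: negative_share(summary[a]), reverse=True)

for aspect in ranked:
    print(aspect, dict(summary[aspect]), round(negative_share(summary[aspect]), 2))
```

Ranking by negative share rather than raw counts keeps a frequently mentioned but mostly praised aspect from crowding out a rarely mentioned one that guests consistently dislike.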
The pattern exemplified by this customer feedback analytics scenario, transforming information using AI capabilities, is core to dealing with unstructured content. As shown in this example, Oracle Cloud Infrastructure provides the tools that you need to perform advanced analytics at scale. With AI services, you don’t even have to be a data scientist to seize the benefits of artificial intelligence!
For more information, see the following resources: