New: Multi-Catalog Data Discovery and Integration

Oracle Autonomous Database provides a powerful analytical engine for cloud data analytics. It is highly available, supports virtually any data format, and offers efficient, scalable data processing, along with a wide range of built-in self-service tools that let data analysts, engineers and scientists collaborate to produce insightful analytics.

However, finding all the data your analytics requires can be challenging. It is sometimes difficult to know exactly which data you need and where it currently resides.

Where is the data I need?

Finding the right data for analytics can be like finding the right book for a research project. You may not know the exact title of the book you want, or where it might be in stock. What you really need is a unified, connected search tool that searches all the libraries and book stores in your area as well as the online book stores, lets you browse and preview potentially relevant books, and makes it easy to have the book delivered once you find it.

This is the experience the Autonomous Database Catalog aims to give you when searching for data.

How does it work?

Autonomous Database Data Studio Catalog Explorer will enable you to connect to your existing data storage systems and catalogs, wherever they are, and provide a unified search and discovery interface to that data. Together with Data Studio’s data load tool, it will make it easy to load, link and feed the data that you want into the Autonomous Database, to power enterprise analytics.

The catalog starts with your local, connected Autonomous Database, but you can easily add more catalogs, to allow you to search and discover data more widely. You can add catalogs for:

  • Other Autonomous Databases in your tenancy – you can use the catalog to browse the other Autonomous Databases and add a catalog connection to any database you need:

screenshot of the catalog showing a list of autonomous databases

  • Other Databases, both on-premises and in the cloud, such as Oracle databases, Exadata, or Oracle Database Cloud Service instances.

  • Other Data Catalogs, such as an AWS Glue Data Catalog that manages all the metadata for data in AWS, or the OCI Data Catalog that does this for OCI Object Storage. 

  • Connected Cloud Storage, such as files in OCI Object Storage, AWS S3, Azure Blob Storage, Google Cloud Storage, and others.

  • Shared Data from other systems and data marketplaces, for example data shared from another Autonomous Database using Cloud Links, or shared from Databricks or any other Delta Sharing system.

Adding catalogs can be done with a few simple steps. The Autonomous Database catalog automatically understands and maps the metadata for these different stores to a standard form for data search and discovery.
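
To make the idea of mapping different metadata stores to a standard form more concrete, here is a minimal Python sketch. It is not the catalog's actual implementation: the `CatalogEntry` class, the two converter functions, and the shapes of the input dictionaries are all hypothetical, chosen only to show how a database table and a cloud storage file could end up as the same kind of searchable record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical common record used for search, whatever the source."""
    catalog: str      # name of the connected catalog the object came from
    name: str         # object name shown in search results
    object_type: str  # for example "TABLE" or "FILE"
    location: str     # schema name, file URI, and so on


def from_database_table(catalog: str, row: dict) -> CatalogEntry:
    # Assumed input shape: a row with "owner" and "table_name" keys.
    return CatalogEntry(catalog, row["table_name"], "TABLE", row["owner"])


def from_storage_object(catalog: str, obj: dict) -> CatalogEntry:
    # Assumed input shape: a storage listing entry with a "uri" key.
    file_name = obj["uri"].rsplit("/", 1)[-1]
    return CatalogEntry(catalog, file_name, "FILE", obj["uri"])


entries = [
    from_database_table("LOCAL", {"owner": "SALES", "table_name": "CUSTOMERS"}),
    from_storage_object(
        "MY_BUCKET", {"uri": "https://example.com/bucket/customers_2023.csv"}
    ),
]

for entry in entries:
    print(entry.catalog, entry.object_type, entry.name)
```

Once every source is expressed in one record type, a single search routine can work across all connected catalogs without knowing where each entry originated.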

Shopping for Data

Once you have connected all the catalogs you need, you can search for data either within a catalog, or across several. The catalog provides a number of ‘Query Scopes’ if you want to limit your searches to certain types of object – and you can customize the catalog to add more scopes. The default scopes are:

  • Tables and Views – search only for database tables (both internal and external), views, and analytic views 

  • Files – search only for files in connected cloud storage

  • Data Objects – search for data objects in the database and on connected cloud storage. This is a superset of the Tables and Views and Files scopes.

  • OCI – browse or search other Oracle Cloud Infrastructure objects, such as other Autonomous Databases, buckets on OCI cloud storage, or a registered OCI Data Catalog

  • Connections – browse or search connections that have already been registered in Data Studio, such as connections to external Data Catalogs or databases

  • All – browse or search all objects in the catalog. 

For example, to find all tables, views or files that may contain customer data, you might select Data Objects and search for cust:

screenshot of a search for customer data in the catalog
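
The scoped search described above can be sketched in a few lines of Python. This is an illustration only, under the assumption that a scope is simply a set of object types and that matching is a case-insensitive substring test; the `SCOPES` dictionary, the type names, and the `search` function are hypothetical, and only three of the default scopes are modeled.

```python
# Hypothetical model: each scope is the set of object types it admits.
SCOPES = {
    "Tables and Views": {"TABLE", "EXTERNAL TABLE", "VIEW", "ANALYTIC VIEW"},
    "Files": {"FILE"},
}
# Data Objects is the union (superset) of the two scopes above.
SCOPES["Data Objects"] = SCOPES["Tables and Views"] | SCOPES["Files"]


def search(objects, scope, term):
    """Return objects within the scope whose name contains the term."""
    allowed_types = SCOPES[scope]
    needle = term.lower()
    return [
        obj for obj in objects
        if obj["type"] in allowed_types and needle in obj["name"].lower()
    ]


objects = [
    {"name": "CUSTOMERS", "type": "TABLE"},
    {"name": "CUSTOMER_SALES_V", "type": "VIEW"},
    {"name": "customers_2023.csv", "type": "FILE"},
    {"name": "ORDERS", "type": "TABLE"},
]

for hit in search(objects, "Data Objects", "cust"):
    print(hit["type"], hit["name"])
```

Searching the same term in the narrower Files scope would return only the CSV file, which is the practical effect of choosing a scope before searching.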

You can then filter your search results using the options on the left-hand side of the screen. For example, the search results above show four tables and over 100 cloud objects (files) that match the search term. To view the tables only, simply click the Table check box:

screenshot showing a refined search for customer data with only tables selected
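
The filtering step above behaves like a faceted search: the results are counted per object type, and ticking a check box keeps only the selected types. A minimal Python sketch of that behavior follows. The sample results, the type names, and the `facet_counts` and `apply_filter` helpers are hypothetical and are not part of any Data Studio API.

```python
from collections import Counter

# Hypothetical search results for the term "cust".
results = [
    {"name": "CUSTOMERS", "type": "TABLE"},
    {"name": "CUSTOMER_ADDRESSES", "type": "TABLE"},
    {"name": "customers_2023.csv", "type": "FILE"},
    {"name": "cust_feedback.json", "type": "FILE"},
    {"name": "custom_reports.parquet", "type": "FILE"},
]


def facet_counts(results):
    """Count results per object type, as shown next to each check box."""
    return Counter(item["type"] for item in results)


def apply_filter(results, selected_types):
    """Keep only the selected types; an empty selection means no filter."""
    if not selected_types:
        return list(results)
    return [item for item in results if item["type"] in selected_types]


counts = facet_counts(results)
tables_only = apply_filter(results, {"TABLE"})

print(dict(counts))
print([item["name"] for item in tables_only])
```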

When you have found an object you are interested in, you can click the object to view its details, or choose one of the other available actions from the menu on the right-hand side. For example, this is the menu for tables in the local Autonomous Database:

screenshot of the catalog menu for local tables

The options from left to right are:

  • View Details – opens the details of the table, allowing you to preview the data, see Lineage and Impact, and view detailed statistics

  • Gather Statistics – gathers statistics on the table so that, for example, you can see the number of rows in the catalog view above

  • Add to Share – adds the table to a new share, for sharing with any Delta Sharing share consumer

  • Register to Cloud Link – shares the table with other Autonomous Databases in the same compartment, tenancy, or region

  • Create Analytic View – creates an Analytic View from the table

  • Query – opens the Analysis application with a default query on the table

  • Export Data to Cloud – exports the table data to a file on connected cloud storage

  • Edit – edits the table definition

  • Drop – drops the table

Making Data Integration Simple

The catalog works with Data Studio’s Data Load tool, so you can easily load, link or feed data sets that you find in the catalog into the local Autonomous Database, using all of the tool's functionality, such as automatic file-to-table mapping, detection of personally identifiable information (PII), sentiment analysis and much more.

For example, if you find a file on cloud storage, or a folder of files, that looks useful, you can use the links to load or link the data into the database so that it is immediately available for query and transformation:

screenshot of the options to load or link a cloud storage file

Alternatively, you can click the file to view its details and download it locally using its URI.
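
The practical difference between loading and linking can be illustrated with a small Python sketch. It assumes that loading copies the data into the database once, while linking leaves the data in place and reads it at query time, much as an external table would. The `LoadedTable` and `LinkedTable` classes are hypothetical stand-ins, and a local CSV file plays the role of a cloud storage object.

```python
import csv
import tempfile
from pathlib import Path


def read_rows(path):
    """Read a CSV file into a list of dictionaries."""
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle))


class LoadedTable:
    """Load: copy the rows once; later changes to the file are not seen."""

    def __init__(self, path):
        self.rows = read_rows(path)

    def query(self):
        return self.rows


class LinkedTable:
    """Link: keep only a reference and read the file on every query."""

    def __init__(self, path):
        self.path = path

    def query(self):
        return read_rows(self.path)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "customers.csv"
    source.write_text("id,name\n1,Ada\n")

    loaded = LoadedTable(source)
    linked = LinkedTable(source)

    # The source file gains a row after the load and link were set up.
    source.write_text("id,name\n1,Ada\n2,Grace\n")

    loaded_count = len(loaded.query())
    linked_count = len(linked.query())

print(loaded_count, linked_count)
```

In this sketch the loaded copy still holds one row while the linked view sees two, which is the trade-off to weigh: loading gives a stable local copy, linking always reflects the current state of the source.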

Summary

The Autonomous Database catalog provides easy access to data from across the enterprise. You can search, discover and understand connected data sets, and choose which to load or link into the database for further processing, or immediate analysis. As the catalog is integrated with Data Studio, you can easily load, transform and analyze data no matter where it currently resides. 

Don’t forget to bookmark the Autonomous Database Get Started page to try Autonomous Database for free, watch demos, learn with self-service workshops, and keep up with the latest news!