Working with data stored outside your database—such as in cloud object storage, data lakes, or distributed file systems—offers flexibility. You can access data where it lives, share datasets across teams, and connect to multiple sources or clouds without copying data into your database. However, querying external data usually introduces higher latency and lower performance than querying data stored inside the database. To address this, many organizations copy or replicate external data into their databases. While this can accelerate queries, it also requires building data pipelines, managing ongoing syncs, and handling data consistency challenges—ultimately adding cost, complexity, and risk. 

Meet External Table Cache: A Simpler, Smarter Way 

Now there’s a simpler and more cost-effective way to solve this challenge. With the external table cache functionality in Autonomous AI Database, you can combine the power of external data processing with in-database performance—without extra pipelines, manual refreshes, or application changes. 

What is External Table Cache and Why Does it Matter? 

The external table cache is a storage area within your Autonomous Database that holds frequently accessed data from external tables (e.g., CSV, Parquet, ORC, Avro, Iceberg). It enables you to bring your most important external data closer to your analytics—transparently and securely. Cache management is controlled by easy-to-set policies. Oracle Autonomous AI Database manages the cache lifecycle, automatically maintains the freshness of the cached data, and provides visibility into what's in the cache. 

Technical teams love the external table cache because it delivers: 

  • Boosted performance for analytics: Caching frequently accessed external data inside the database delivers query speeds orders of magnitude faster, making it ideal for BI dashboards, recurring analyses, and regular reporting. 
  • Transparent query acceleration: Your queries, dashboards, and apps work as before, without any rewrites. Autonomous AI Database handles everything under the hood. 
  • Granular, flexible caching control: Cache all files, a percentage, or just the latest data. Use procedures to control precisely what’s cached, cache size, and storage limits. 
  • Data freshness through automated updates: The cache stays synchronized with the latest changes in your external data. New or updated files can be easily repopulated, while data from deleted or inaccessible files is automatically invalidated, so queries always reflect current and accurate information. 
  • No manual cache management required: The database automates cache population, refresh, and cleanup—eliminating the need for custom scripts, manual tracking, or intervention. 
  • Efficient cache management: Easily repopulate, retire, or drop cached files based on time or usage. Monitor cache status using views. When files are removed or credentials revoked, cached data is invalidated and inaccessible, adhering to the organization’s security and compliance needs. 

Business teams love external table cache because: 

  • Faster insights, better decisions: By eliminating latency and repeated cloud fetches, teams get answers faster—enabling timelier decisions and increased business agility. 
  • Lower cloud and data transfer costs: Caching cuts down on repeated downloads of large files from object storage, reducing egress fees—especially valuable for multi-cloud scenarios and high-volume environments. 
  • Simplicity, reduced risk, and governance: No need to alter applications or ETL pipelines—minimizing deployment risks and ongoing maintenance. Caching is schema-local, so existing governance models and user controls stay intact. 
  • Future-proof analytics infrastructure: As workload and data volumes grow, policy-driven and automatic caching (coming soon) optimizes cost and performance without additional engineering effort. 

How It Works – Quick Example 

Let’s say your analytics team regularly queries Parquet files in Object Storage. Here’s how to accelerate query performance in five simple steps. 

1. Create an external table

First, create an external table that points to the Parquet files: 

BEGIN 
  DBMS_CLOUD.CREATE_EXTERNAL_TABLE( 
    table_name      => 'MY_PARQUET_XT', 
    credential_name => 'MY_CLOUD_CRED', 
    file_uri_list   => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/mybucket/b/myfolder/o/*.parquet', 
    format          => '{"type":"parquet", "schema":"first"}' 
  ); 
END; 
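
Before creating a cache, it can be worth confirming the external table reads your files. A minimal sanity check (illustrative only; the count depends on your data):

```sql
-- Sanity check: confirm the external table can read the Parquet files.
SELECT COUNT(*) FROM MY_PARQUET_XT;
```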

2. Create the cache 

This step creates an empty cache for your external table; no data is loaded yet. 

BEGIN 
  DBMS_EXT_TABLE_CACHE.CREATE_CACHE( 
    owner          => 'MY_SCHEMA', 
    table_name     => 'MY_PARQUET_XT', 
    partition_type => 'PATH' 
  ); 
END; 

3. Populate the cache 

Now, fill the cache with data. You can load data from some files, the newest files, or all files—based on your organization’s needs. 

-- Add 50% of the files: 
BEGIN 
  DBMS_EXT_TABLE_CACHE.ADD_TABLE( 
    owner         => 'MY_SCHEMA', 
    table_name    => 'MY_PARQUET_XT', 
    percent_files => 50 
  ); 
END; 

Note: Data that is not cached remains accessible. Even when you cache only part of the table (for example, 50% of the files), queries still return all data: cached data is served at in-database speed, while the rest is read as needed from Object Storage. 

-- Or add only the latest files (last 7 days): 
BEGIN 
  DBMS_EXT_TABLE_CACHE.ADD_LATEST_FILES( 
    owner      => 'MY_SCHEMA', 
    table_name => 'MY_PARQUET_XT', 
    since      => INTERVAL '7' DAY 
  ); 
END; 
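
Once the cache is populated, existing queries run unchanged. A sketch of a typical application query (the column name sales_amount is hypothetical, not part of this example's schema):

```sql
-- Unchanged application query: cached files are read from inside the
-- database; any non-cached files are fetched from Object Storage as needed.
-- The column name sales_amount is hypothetical.
SELECT COUNT(*)          AS row_count,
       SUM(sales_amount) AS total_sales
FROM   MY_PARQUET_XT;
```

No hints, rewrites, or configuration changes are needed—the database applies the cache transparently. 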

4. Keep your cache fresh 

Retire old cached files—for example, those older than 30 days: 

BEGIN 
  DBMS_EXT_TABLE_CACHE.RETIRE_FILES( 
    owner      => 'MY_SCHEMA', 
    table_name => 'MY_PARQUET_XT', 
    before     => INTERVAL '30' DAY 
  ); 
END; 

Remove the cache completely: 

BEGIN 
  DBMS_EXT_TABLE_CACHE.DROP_CACHE( 
    owner      => 'MY_SCHEMA', 
    table_name => 'MY_PARQUET_XT' 
  ); 
END; 

5. Monitor cache status 

SELECT external_table_name, 
       cached, 
       cache_cur_size, 
       last_refresh_time 
FROM   user_external_tab_caches; 
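
To check a single table, you can filter the same view by name (using only the columns shown above):

```sql
SELECT cached, cache_cur_size, last_refresh_time
FROM   user_external_tab_caches
WHERE  external_table_name = 'MY_PARQUET_XT';
```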

That’s it—you don’t need any app changes or extra pipelines to take advantage of caching, keeping dashboards and analytics running fast and up-to-date. 

What’s Next? 

Today, external table cache is policy-based, giving you full control of what’s cached and when. Soon, automatic caching will leverage usage analytics to optimize performance even further—Autonomous AI Database will automatically cache your most-used data sets. 

Get Started 

External table cache provides significant value for teams that frequently access external data through dashboards and reports. With external table cache, you benefit from consistently high performance and up-to-date results—all without additional development or maintenance effort. Try external table cache on your key datasets today to see how it can make your analytics faster, simpler, and more cost-effective. 

For more information, check out the documentation.