Data Transforms May 2026 Release: Expanding Iceberg Support with Oracle AI Catalog

Overview

The latest Data Transforms release introduces a focused set of enhancements across AI enablement, Apache Iceberg optimization, Fusion integration, new connectors, and operational security. The goal of this release is simple: help customers move more data, optimize open table formats, enrich data with AI-ready vectors, and manage enterprise credentials more securely.

Snowflake Open Catalog Support for Apache Iceberg

Data Transforms now certifies support for Snowflake Open Catalog, Snowflake's managed service for Apache Polaris, as a catalog provider for Apache Iceberg connections. This allows users to write Data Load output into Iceberg tables managed through Snowflake Open Catalog.

With this support, users can create an Apache Iceberg connection, select Snowflake Open Catalog as the catalog provider, and configure the catalog name, REST URL, token path, client ID, client secret, and storage settings. Once configured, the connection can be used as a Data Load target to load Oracle source data into Iceberg tables through Snowflake Open Catalog.
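As a rough illustration of the settings listed above, the following Python sketch models them as a small record and checks that none are missing before the connection is used. The class name, field names, and example values are hypothetical and are not part of any Data Transforms API. How the token path is joined to the REST URL is also an assumption.

```python
from dataclasses import dataclass, fields


@dataclass
class IcebergCatalogSettings:
    """Hypothetical record of the settings named in the text (not a product API)."""
    catalog_name: str
    rest_url: str
    token_path: str
    client_id: str
    client_secret: str
    storage_location: str

    def missing_fields(self) -> list[str]:
        """Return the names of settings that are empty or whitespace only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def token_url(self) -> str:
        """Join the REST URL and the token path (assumed to be a relative path)."""
        return self.rest_url.rstrip("/") + "/" + self.token_path.lstrip("/")


# Example values are placeholders, not real endpoints or credentials.
settings = IcebergCatalogSettings(
    catalog_name="analytics",
    rest_url="https://example.invalid/polaris/api/catalog/",
    token_path="/v1/oauth/tokens",
    client_id="my-client-id",
    client_secret="",
    storage_location="s3://example-bucket/warehouse",
)
print(settings.missing_fields())  # ['client_secret']
print(settings.token_url())
```

Checking for empty values up front is a design convenience: a missing client secret is easier to diagnose here than as an authentication failure during a Data Load run.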

This is an important step for customers standardizing on Iceberg and looking for interoperability across open lakehouse architectures. It gives Data Transforms users a direct path to participate in Snowflake Open Catalog-managed Iceberg environments while continuing to use familiar Data Load patterns.

Vector Generation Support for Images

Data Transforms continues to expand its AI capabilities with OCI Generative AI image embedding support. In addition to existing text embedding support, users can now generate vector representations from image content.

A new Image Embedding Vector operator is available in Data Flow. It uses an OCI Generative AI connection to call the embedding model and write the generated vector into a target column. This enables use cases such as image similarity, product matching, visual search, AI-powered enrichment, and downstream machine learning workflows.

The typical flow is straightforward: create or select an OCI Generative AI connection, add the Image Embedding Vector component to a data flow, configure the model, map the source image field or reference, and write the resulting vector to the target column.
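Once image vectors are written to the target column, use cases such as image similarity and product matching come down to comparing vectors. The sketch below uses cosine similarity, which is a common choice but an assumption here, since the release does not prescribe a distance metric. The three-element vectors and file names are made up for illustration; real embedding vectors have many more dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def most_similar(query: list[float], catalog: dict[str, list[float]]) -> str:
    """Return the key of the catalog vector closest to the query vector."""
    return max(catalog, key=lambda name: cosine_similarity(query, catalog[name]))


# Hypothetical stored vectors, keyed by image name.
catalog = {
    "red_shoe.jpg": [1.0, 0.0, 0.0],
    "blue_bag.jpg": [0.0, 1.0, 0.0],
    "red_boot.jpg": [0.9, 0.1, 0.0],
}
print(most_similar([1.0, 0.05, 0.0], catalog))
```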

Oracle AI Data Catalog Support

Data Transforms now integrates with Oracle AI Data Catalog as a catalog provider for Apache Iceberg connections. This allows users to load data into governed Iceberg tables managed through Oracle AI Data Catalog.

Users can create an Apache Iceberg connection, select Oracle AI Data Catalog as the catalog provider, and configure the catalog name, service URL, authentication credentials, and storage settings. Once configured, the connection can be used as a Data Load target to load Oracle source data into Iceberg tables.

This integration brings enterprise-grade metadata governance to Iceberg workloads and helps organizations unify data assets across their lakehouse environments. It also strengthens Oracle’s support for scalable, governed, and AI-ready data platforms built on Apache Iceberg.

Iceberg: Parquet Compaction and Clustering

Data Transforms now includes workflow actions for Apache Iceberg Parquet compaction and clustering. This helps customers improve query performance by reorganizing many small Parquet files into larger and better-optimized files.

This matters because repeated incremental loads into Iceberg tables can create many small files. Over time, those files can negatively affect performance. With this release, users can add workflow actions for compaction, clustering, compaction plus clustering, incremental compaction, and incremental clustering.
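To make the small-file problem concrete, the sketch below plans a compaction pass by grouping small files into batches that stay under a target size. It is a simplified illustration, not the algorithm Data Transforms or any Iceberg engine uses. The function name and the 64 MB target are assumptions chosen for the example.

```python
def plan_compaction(file_sizes_mb: list[int], target_mb: int = 64) -> list[list[int]]:
    """Group small files into batches whose combined size stays within target_mb.

    Files already at or above the target are left alone, and batches that
    would contain a single file are dropped because rewriting them gains nothing.
    """
    groups: list[list[int]] = []
    current: list[int] = []
    current_size = 0
    for size in sorted(file_sizes_mb):
        if size >= target_mb:
            continue  # already large enough, not a compaction candidate
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return [group for group in groups if len(group) > 1]


# Five small files from incremental loads plus one large file.
print(plan_compaction([10, 20, 30, 5, 8, 600]))
```

In this example the five small files fit in two batches, so six files become three after the rewrite, and a query has fewer files to open.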

Compaction helps reduce file fragmentation by combining smaller Parquet files. Clustering organizes related data together so query engines can prune data more efficiently. Together, these capabilities help keep Iceberg tables healthier and more performant after ongoing Data Load operations.
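The pruning benefit of clustering can be shown with a toy model. In the sketch below, each file records the minimum and maximum of one column, and a range query only needs to read files whose range overlaps the predicate. The helper names and data are hypothetical, and real engines rely on Iceberg metadata rather than this simplified bookkeeping.

```python
def file_ranges(values: list[int], rows_per_file: int) -> list[tuple[int, int]]:
    """Split values into fixed-size files and return (min, max) for each file."""
    chunks = [values[i:i + rows_per_file] for i in range(0, len(values), rows_per_file)]
    return [(min(chunk), max(chunk)) for chunk in chunks]


def files_to_scan(ranges: list[tuple[int, int]], low: int, high: int) -> int:
    """Count files whose (min, max) range overlaps the query range [low, high]."""
    return sum(1 for file_min, file_max in ranges if file_max >= low and file_min <= high)


unclustered = [7, 1, 9, 3, 8, 2, 6, 4, 5, 0, 11, 10]  # arrival order
clustered = sorted(unclustered)                        # related values stored together

# Query: values between 0 and 3, with four rows per file.
print(files_to_scan(file_ranges(unclustered, 4), 0, 3))  # 3 (every file overlaps)
print(files_to_scan(file_ranges(clustered, 4), 0, 3))    # 1 (two files are pruned)
```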

Iceberg: Statistics Support

The release also adds an Iceberg Statistics workflow step, which collects or refreshes statistics for Iceberg-backed external tables after data changes.

After loading or updating Iceberg data, statistics can become stale. Refreshing statistics helps the database optimizer make better decisions and can improve query execution plans. Users can add the Iceberg Statistics step to a workflow, select the Iceberg connection, choose the Oracle connection and schema containing the external tables, and then refresh statistics for selected tables or all relevant external tables.
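For readers who want a sense of what refreshing statistics means at the database level, the sketch below builds the kind of `DBMS_STATS.GATHER_TABLE_STATS` call an administrator could run by hand for each external table. Whether the Iceberg Statistics step issues these exact calls is an assumption; the release does not describe its internals. The helper name, schema, and table names are hypothetical.

```python
import re

# Conservative check for unquoted Oracle identifiers, used to avoid building
# statements from untrusted or malformed names.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def gather_stats_statements(schema: str, tables: list[str]) -> list[str]:
    """Return one anonymous PL/SQL block per table that gathers its statistics."""
    for name in [schema, *tables]:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple identifier: {name!r}")
    return [
        "BEGIN DBMS_STATS.GATHER_TABLE_STATS("
        f"ownname => '{schema.upper()}', tabname => '{table.upper()}'); END;"
        for table in tables
    ]


for statement in gather_stats_statements("sales", ["orders_ext", "customers_ext"]):
    print(statement)
```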

This is a practical optimization feature for customers using Iceberg data in analytical query workloads.

OCI Secret Vault Integration

Data Transforms now supports storing connection passwords and secrets in OCI Vault. This reduces the need to store sensitive connection credentials directly in database-managed credentials and aligns with enterprise security practices.

With this integration, users can create a vault and secret in OCI, store the connection password or secret value in OCI Vault, and configure the Data Transforms connection to reference the secret OCID. Data Transforms can then retrieve the secret at runtime, provided the required resource principal setup and permissions are configured.
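The runtime lookup described above can be pictured with the OCI Python SDK. The sketch below is an illustration of how a client running with a resource principal might read a secret by OCID; it is not the Data Transforms implementation. It assumes the `oci` package is installed and that the required policies are in place, and the function names are hypothetical.

```python
import base64


def decode_secret_content(content_b64: str) -> str:
    """Decode the base64 payload that a secret bundle carries."""
    return base64.b64decode(content_b64).decode("utf-8")


def read_secret(secret_ocid: str) -> str:
    """Fetch the current version of a secret using resource principal auth.

    Imported lazily so the decoding helper can be used without the SDK.
    """
    import oci  # OCI Python SDK, assumed to be installed

    signer = oci.auth.signers.get_resource_principals_signer()
    client = oci.secrets.SecretsClient(config={}, signer=signer)
    bundle = client.get_secret_bundle(secret_id=secret_ocid).data
    return decode_secret_content(bundle.secret_bundle_content.content)


# Local demonstration of the decoding step only (no call to OCI is made).
encoded = base64.b64encode(b"example-password").decode("ascii")
print(decode_secret_content(encoded))  # example-password
```

Because the connection stores only the secret OCID, rotating the secret in OCI Vault changes what a lookup like this returns without any edit to the connection definition.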

This improves the security posture for customers who already use OCI Vault for centralized secret management. It also supports better operational practices, such as rotating secrets in OCI Vault without directly exposing credentials in Data Transforms.

Additional Enhancements

The release also includes several other updates worth noting:

New source connectors: Data Transforms adds source-only support for MariaDB and Sage Intacct, both usable in Data Load and Data Flow.
Data Load job panel enhancements: The job details experience has been reorganized to make large multi-table jobs easier to inspect and troubleshoot.
Data Load audit option: New auditing capabilities help improve traceability, governance, and troubleshooting for load operations.
Fusion integration: Support now covers Incentive Compensation and Oracle Fusion Subscription Management, helping automate CSV-based ingestion from object storage into Fusion through import APIs.

In Summary

The May 2026 Data Transforms release strengthens several important areas: open lakehouse interoperability, AI-ready data enrichment, Fusion connectivity, Iceberg performance optimization, and enterprise-grade secret management. Users working with Apache Iceberg should especially review the new Snowflake Open Catalog and Oracle AI Data Catalog support, along with the compaction, clustering, and statistics capabilities. Customers focused on AI and semantic search should explore the new image embedding support, while enterprise teams should evaluate OCI Vault integration as part of their credential management strategy.

For more details on the features in this release, refer to the Oracle Data Transforms 2026.03.23.00 release notes.


Ashish Jain

Senior Principal Product Manager, Autonomous Database
