Oracle GoldenGate Data Streams, introduced with Oracle GoldenGate 23ai, is a next-generation, cloud-ready feature designed to give developers and data scientists direct, real-time access to transactional data captured by Oracle GoldenGate. By using the AsyncAPI specification and modern protocols such as WebSocket, Data Streams enables seamless subscription to database change events through a publish/subscribe model. Updates are delivered as soon as changes are committed in the source database, reducing latency and eliminating the need for intermediary systems.
GoldenGate Data Streams supports flexible data formats, such as JSON, to ensure effortless integration with existing tools, workflows, and frameworks. Its standards-based architecture streamlines data ingestion, simplifies application development, and ensures data integrity by reliably replicating committed changes for real-time analytics and event-driven operations.

 

Benefits for Developers and Data Scientists

GoldenGate Data Streams enables real-time streaming of database change events directly to subscribing applications and platforms. Unlike traditional messaging systems that use an internal message store or queue, GoldenGate Data Streams sends change records directly from source trail files as they are generated. This architecture is optimized for:

  • Low-latency event processing
  • Decentralized data distribution (data mesh)
  • Microservice and API-driven ecosystems
  • On-demand integration and analytics

How to Use:

  1. Configure a Data Stream:
    Use Oracle GoldenGate Distribution Service to create a data stream from your database. Set options like encoding, buffer size, and Quality of Service (QoS).
    Refer to Add Data Streams for more information.
  2. Connect Using AsyncAPI:
    Utilize the AsyncAPI specification provided for your data stream to auto-generate client code in your preferred language.
  3. Consume Change Events:
    Applications subscribe to the stream, receive data in near real-time, and process or route changes as needed.
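The connection step above can be sketched in Python. The host and path here are illustrative (matching the example URLs later in this post); the actual endpoint and message schemas come from the AsyncAPI specification generated for your stream.

```python
from urllib.parse import urlencode

# Hypothetical host; the real endpoint is published in the AsyncAPI
# specification for your data stream.
HOST = "datastream.example.com"

def build_stream_url(begin: str) -> str:
    """Build the Data Streams WebSocket URL.

    'begin' may be "now", "earliest", an ISO 8601 timestamp, or a
    saved LCR position; urlencode percent-encodes the value safely.
    """
    return f"wss://{HOST}:443/ggs/v1/stream?{urlencode({'begin': begin})}"

print(build_stream_url("now"))
# wss://datastream.example.com:443/ggs/v1/stream?begin=now
```

Note that colons in an ISO timestamp are percent-encoded by `urlencode`, which is equivalent to the literal form shown in the examples below.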

 

Understanding Data Stream Position Tracking and Recovery

Oracle GoldenGate Data Streams gives client applications granular control over where to begin streaming, both on initial connection and whenever a stream is restarted. This design eliminates the need for a persistent internal message queue: the service streams directly from GoldenGate trail files and lets clients specify exactly what data to deliver.

When a client connects (or re-connects), it includes a begin parameter in the WebSocket URL or API request. This parameter informs GoldenGate Data Streams where to start or resume streaming. You can specify:

  • Special keyword “now”
    Begin streaming from newly committed changes, excluding older available records. Use this when you want only future data.
  • Special keyword “earliest”
    Start from the very first change available in the trail files, ensuring you receive all historical events still retained by the service.

  • ISO 8601 format timestamp string
    Provide an exact date and time (UTC, ISO 8601 format) to stream from a specific moment in your data’s history. GoldenGate locates the event that matches the provided time, or starts from the next available record if there is no exact match.

  • Last processed LCR position
    For seamless recovery and precise resume points, use the “LCR position” from the last change you processed. This position, returned within each streamed record, encapsulates metadata such as the commit sequence number, transaction ID, and record number.

Note: If you provide an LCR position but the corresponding trail files have already been removed or archived, the consumer hangs and cannot resume streaming from that position, because the record can no longer be found.

 

How It Works Behind the Scenes:

Every streamed record (excluding metadata) includes an opaque position marker. The client is responsible for tracking the last processed position. On a regular or recovery connection, pass your saved position in the begin parameter.

  • For “now”, GoldenGate maps this to the current system time; only fresh data is streamed.
  • For “earliest”, the system starts with the oldest available record in the current trail files.
  • For a timestamp, GoldenGate locates and starts from the closest possible event at or after the provided time.
  • For a stored position, the service resumes exactly where you left off, ensuring no changes are missed or duplicated. This is vital for reliable stream processing and recovery.

Careful tracking and use of LCR positions by your client enables robust, at-least-once delivery and repeatable data processing in the face of disconnects or application restarts.
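A minimal checkpointing sketch of this pattern, assuming a local JSON file as the position store (any durable store works); the file name and position value are hypothetical:

```python
import json
import os

CHECKPOINT = "stream.ckpt"  # hypothetical local store for the last position

def save_checkpoint(record: dict) -> None:
    """Persist the opaque 'pos' of the last successfully processed record."""
    with open(CHECKPOINT, "w") as f:
        json.dump({"pos": record["pos"]}, f)

def load_begin(default: str = "earliest") -> str:
    """Value for the 'begin' parameter: the saved LCR position,
    or a fallback keyword on the very first run."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)["pos"]
    return default

# After each successfully processed record, store its position:
save_checkpoint({"pos": "G-EXAMPLEPOSITION"})  # hypothetical position value
# On the next (re)connect, resume from exactly that point:
print(load_begin())
```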

Example connection:

| Begin value | Example URL |
| --- | --- |
| now | wss://datastream.example.com:443/ggs/v1/stream?begin=now |
| earliest | wss://datastream.example.com:443/ggs/v1/stream?begin=earliest |
| ISO timestamp | wss://datastream.example.com:443/ggs/v1/stream?begin=2024-06-07T11:00:49.034948Z |
| LCR position | wss://datastream.example.com:443/ggs/v1/stream?begin=G-AAAAILHXAQAAAAAAQ…390312011592369208.10.3.10172 |

Let’s see these options in action:

  1. Configure the stream by specifying the source trail, encoding format, buffer size, and desired quality of service parameters. For detailed guidance, refer to the official Oracle GoldenGate documentation.
    After creating the Data Stream, download the provided AsyncAPI specification (YAML) from the Distribution Service. Use this YAML file to auto-generate client code or configure your application to subscribe and consume real-time change data from the stream endpoint.
     

DS Create

 

2. Use your preferred programming language or framework—such as Node.js, Python, Java, or Go—to connect to the Data Stream endpoint. Set the begin=earliest parameter in your connection URL or request to ensure the client processes all records from the beginning of the source trail. Refer to the AsyncAPI YAML definition to correctly handle message schemas and streamline integration with your chosen development environment.
 

DS Earliest
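As a sketch, a Python client for this step might look like the following, using the third-party websockets package. The endpoint URL and the payload shape (a JSON array of change records) are assumptions here; consult the AsyncAPI YAML for your stream.

```python
import asyncio
import json

async def consume(url: str) -> None:
    """Subscribe to a Data Stream endpoint and print each change record."""
    import websockets  # third-party; imported lazily so the sketch loads without it
    async with websockets.connect(url) as ws:
        async for message in ws:
            # Payload shape is an assumption; check your AsyncAPI spec.
            for record in json.loads(message):
                print(record["op_type"], record["qual_table"], record["pos"])

# Process all records from the beginning of the source trail:
# asyncio.run(consume("wss://datastream.example.com:443/ggs/v1/stream?begin=earliest"))
```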

 

3. Next, configure your client to start streaming records from a defined point in time using the begin=<timestamp> parameter (for example, begin=2025-10-15T10:15:17.000000Z).

Note that multiple records can share the same timestamp, as all changes within a single database transaction will have the identical commit timestamp. Your client should be prepared to process a batch of records corresponding to that timestamp when starting from this position.
 

  {
    "qual_table": "OGGDB_PDB1.MYAPP.MY_JSON_TABLE",
    "op_type": "INSERT",
    "op_ts": "2025-10-15T10:15:17.000000Z",
    "pos": "G-AAAAAMUMAAAAAAAAAAAAAAAAAAAHABMAAA==29033131557587871.5.19.803",
    "xid": "1557587871.5.19.803",
    "after": {
      "ID": 9,
      "DATA": "{\"name\":\"John Doe\",\"age\":34,\"active\":true}"
    }
  },
  {
    "qual_table": "OGGDB_PDB1.MYAPP.MY_JSON_TABLE",
    "op_type": "INSERT",
    "op_ts": "2025-10-15T10:15:17.000000Z",
    "pos": "G-AAAAAMUMAAAAAAAAAQAAAAAAAAAHABMAAA==29033131557587871.5.19.803",
    "xid": "1557587871.5.19.803",
    "after": {
      "ID": 10,
      "DATA": "{\"name\":\"John Doe\",\"age\":34,\"active\":true}"
    }
  },


Begin TS
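One way to handle such a batch is to track the pos values already processed and skip them on restart; `skip_processed` below is an illustrative helper, not a GoldenGate API, and the position strings are hypothetical.

```python
def skip_processed(records: list[dict], seen_pos: set[str]) -> list[dict]:
    """Drop records whose opaque 'pos' was already handled, then remember
    the new ones; useful when resuming at a commit timestamp that is
    shared by several records of the same transaction."""
    fresh = [r for r in records if r["pos"] not in seen_pos]
    seen_pos.update(r["pos"] for r in fresh)
    return fresh

batch = [{"pos": "pos-A", "after": {"ID": 9}},   # hypothetical positions
         {"pos": "pos-B", "after": {"ID": 10}}]
seen = {"pos-A"}  # pos-A was processed before the restart
print(skip_processed(batch, seen))  # only the pos-B record remains
```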

 

4. The final option is to resume consumption using the exact position of the last processed Logical Change Record (LCR). Each streamed record contains a pos field such as:

  {
    "qual_table": "OGGDB_PDB1.MYAPP.MY_JSON_TABLE",
    "op_type": "INSERT",
    "op_ts": "2025-10-15T10:15:17.000000Z",
    "pos": "G-AAAAAMUMAAAAAAAACQAAAAAAAAAHABMAAA==29033131557587871.5.19.803",
    "xid": "1557587871.5.19.803",
    "after": {
                "ID": 18,
                "DATA": "{\"name\":\"John Doe\",\"age\":34,\"active\":true}"
             }
},


By capturing and storing the pos value from the last successfully processed record, you can restart your consumer by specifying this value in the begin parameter of your next Data Stream connection. This method ensures the stream resumes precisely from the next unprocessed record, enabling seamless recovery and safeguarding against data loss or duplication.
 

Begin Pos
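Putting it together, the body of a recovery-aware consume loop might look like this sketch; `handle_message` is a hypothetical helper, and the message is assumed to arrive as a JSON array of records shaped like the one above.

```python
import json

def handle_message(raw: str, checkpoint: dict) -> int:
    """Apply each change record, then advance the stored recovery position."""
    records = json.loads(raw)
    for rec in records:
        # ... apply rec["after"] to the target system here ...
        checkpoint["pos"] = rec["pos"]  # value for begin= on the next connect
    return len(records)

raw = json.dumps([{
    "qual_table": "OGGDB_PDB1.MYAPP.MY_JSON_TABLE",
    "op_type": "INSERT",
    "pos": "G-AAAAAMUMAAAAAAAACQAAAAAAAAAHABMAAA==29033131557587871.5.19.803",
    "after": {"ID": 18},
}])
ckpt = {}
handle_message(raw, ckpt)
print(ckpt["pos"])  # pass this as begin=<pos> when reconnecting
```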

 

Why This Model Works

  • Decoupled and Scalable: Consumers independently manage recovery and start points, making architecture flexible and scalable.
  • Optimized for Fresh Data: Most customers use GoldenGate Data Streams to power operations that need live, up-to-date changes—direct from the source.
  • Operational Simplicity: No management of intermediate queues or buffers; focus is on high-speed delivery and application-driven checkpointing.

 

Client Responsibilities & Best Practices

  • Track Last Processed Position: Store the position of the last successfully processed Logical Change Record (LCR) in your application or an external system.
  • Monitor Trail File Availability: Ensure your retention settings align with consumers’ needs. If a client resumes after trail files have rotated out, recovery starts from the earliest available position.
  • Apply Appropriate QoS Settings:
    • Exactly Once: No duplicates, but if a record’s position is missing, streaming cannot resume from that exact point.
    • At Least Once: Possible duplicates if restarting; fails if position is missing.
    • At Most Once: Skips to the next available record if an exact position is missing; no duplicates.

Schema Record Delivery

GoldenGate Data Streams enriches the stream by delivering four types of schema records to ensure your consumer knows how to interpret all subsequent data:

  • DML operation records (data changes)
  • DDL operation records (schema changes)
  • Object metadata records
  • Data stream metadata records

 

Summary: Fast, Flexible Data Streaming with Client-Managed Recovery

Oracle GoldenGate Data Streams delivers a robust, standards-based data streaming platform for real-time enterprise integration. By letting applications control start and restart points, and by streaming directly from live trail files, it enables low-latency, scalable, and resilient data integration tailored to your needs.

Resources: