Archiving streams on OCI

Streaming in Oracle Cloud Infrastructure (OCI) provides a highly scalable and reliable solution for ingesting and consuming high-throughput data in real time. According to the Streaming service limits documentation, messages in a stream can be retained for at most 7 days (at the time this post was written). Although this limit is sufficient for most use cases, sometimes you need to retain the data in a stream for much longer, such as months or years.

This post describes one way to store a stream of data for an extended period.

Overview

This solution uses the Streaming service, Service Connector Hub, and Object Storage.

Streaming provides the source of data that needs to be stored for an extended period.
Object Storage is the destination where the data is stored or archived. You can choose storage tiers and lifecycle policy rules based on how often the archived stream is accessed.
Service Connector Hub acts as an orchestrator that moves the data from a stream to Object Storage.

All of these services are managed services in Oracle Cloud Infrastructure, which means that you don’t need to worry about the scalability and the reliability of the solution.

Set up the solution

First, set up a stream or use an existing one. For this demo, I created a stream called FirstPublicStream. It’s a public stream with two partitions and a default retention policy of 24 hours.

A screenshot that shows the details page in the OCI console for the example stream.
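If you prefer to script this step instead of using the console, the following is a minimal sketch using the OCI Python SDK. It assumes the SDK is installed and a config file exists at ~/.oci/config. The compartment OCID is a placeholder, the helper names are hypothetical, and the lower retention bound of 24 hours is my assumption based on the default mentioned above.

```python
def stream_settings(name, partitions, retention_hours, compartment_id):
    """Collect the settings for a new stream and sanity-check them."""
    # 24 hours is the default and 168 hours (7 days) is the limit quoted in this post.
    if not 24 <= retention_hours <= 168:
        raise ValueError("retention must be between 24 and 168 hours")
    if partitions < 1:
        raise ValueError("a stream needs at least one partition")
    return {
        "name": name,
        "partitions": partitions,
        "retention_in_hours": retention_hours,
        "compartment_id": compartment_id,
    }


def create_stream(settings, profile="DEFAULT"):
    """Create the stream through the OCI Python SDK (imported lazily)."""
    import oci  # requires `pip install oci` and a configured ~/.oci/config

    config = oci.config.from_file(profile_name=profile)
    client = oci.streaming.StreamAdminClient(config)
    details = oci.streaming.models.CreateStreamDetails(**settings)
    return client.create_stream(details).data


settings = stream_settings(
    name="FirstPublicStream",
    partitions=2,
    retention_hours=24,
    compartment_id="ocid1.compartment.oc1..example",  # placeholder OCID
)
print(settings)
# create_stream(settings)  # uncomment to actually call OCI
```

Keeping the validation separate from the SDK call lets you check the settings locally before anything is created in your tenancy.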

Next, set up an Object Storage bucket as the location to store the stream data.

A screenshot that shows the Object Storage page in the console with the Create Bucket button highlighted.

For this demo, I created a bucket called StreamingBucket. It’s a private bucket and is in the standard tier.

A screenshot that shows the details page in the OCI console for the example bucket.
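The overview mentions lifecycle policy rules for the archived data. As a rough illustration, the sketch below builds a policy document that moves objects to the archive tier after 30 days and deletes them after two years. The field names follow my understanding of the Object Storage lifecycle policy API and should be checked against the current reference. The rule names and day counts are made up for this example.

```python
import json

# Actions this sketch allows; treat the list as an assumption to verify.
VALID_ACTIONS = {"INFREQUENT_ACCESS", "ARCHIVE", "DELETE"}


def lifecycle_rule(name, action, days, prefix=None):
    """Build one lifecycle rule that acts on objects older than `days`."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    if days < 1:
        raise ValueError("days must be at least 1")
    rule = {
        "name": name,
        "action": action,
        "timeAmount": days,
        "timeUnit": "DAYS",
        "isEnabled": True,
    }
    if prefix:
        # Restrict the rule to objects whose names start with this prefix.
        rule["objectNameFilter"] = {"inclusionPrefixes": [prefix]}
    return rule


policy = {
    "items": [
        lifecycle_rule("archive-after-30-days", "ARCHIVE", 30),
        lifecycle_rule("delete-after-2-years", "DELETE", 730),
    ]
}
print(json.dumps(policy, indent=2))
```

Pick the thresholds from your own access pattern: data that is rarely read after the first month is a reasonable candidate for the archive tier.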

Finally, set up a service connector in Service Connector Hub that takes the stream (FirstPublicStream) as the source and the Object Storage bucket (StreamingBucket) as the destination. Optionally, you can also specify a function, but for simplicity this demo omits it.

A screenshot that shows the Service Connector Hub page in the OCI console, with the example service connector listed.

A screenshot that shows the service connector details that lists the source and target.
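For reference, the source and target shown above can also be written down as a connector definition. The sketch below builds such a definition as a plain dictionary. The field names reflect my reading of the Service Connector Hub API and are not taken from this post, so verify them before use. The display name, OCIDs, and namespace are placeholders.

```python
import json


def connector_definition(display_name, compartment_id, stream_id,
                         namespace, bucket_name, prefix=None):
    """Describe a connector that copies a stream into a bucket."""
    target = {
        "kind": "objectStorage",
        "namespace": namespace,
        "bucketName": bucket_name,
    }
    if prefix:
        target["objectNamePrefix"] = prefix
    return {
        "displayName": display_name,
        "compartmentId": compartment_id,
        "source": {
            "kind": "streaming",
            "streamId": stream_id,
            # TRIM_HORIZON starts from the oldest retained message.
            "cursor": {"kind": "TRIM_HORIZON"},
        },
        "target": target,
    }


definition = connector_definition(
    display_name="StreamToBucketConnector",           # hypothetical name
    compartment_id="ocid1.compartment.oc1..example",  # placeholder OCID
    stream_id="ocid1.stream.oc1..example",            # placeholder OCID
    namespace="mytenancynamespace",                   # placeholder namespace
    bucket_name="StreamingBucket",
)
print(json.dumps(definition, indent=2))
```

No task section appears in the definition because this demo does not use a function.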

After this setup is done, the service connector runs continuously and transfers the messages in the stream to the Object Storage bucket.
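To have something to archive, you can publish a few test messages to the stream. The sketch below is one way to do that with the OCI Python SDK, assuming the SDK is installed and configured. The Streaming API expects Base64-encoded keys and values, so the encoding is done in a separate helper that runs without any OCI access. The message contents, helper names, stream OCID, and endpoint are placeholders.

```python
import base64
import json


def to_entry(key, payload):
    """Encode one message the way the Streaming API expects it (Base64)."""
    def encode(text):
        return base64.b64encode(text.encode("utf-8")).decode("ascii")

    return {"key": encode(key), "value": encode(json.dumps(payload))}


def publish(entries, stream_id, endpoint, profile="DEFAULT"):
    """Send the encoded entries to the stream (OCI SDK imported lazily)."""
    import oci  # requires `pip install oci` and a configured ~/.oci/config

    config = oci.config.from_file(profile_name=profile)
    client = oci.streaming.StreamClient(config, service_endpoint=endpoint)
    details = oci.streaming.models.PutMessagesDetails(
        messages=[
            oci.streaming.models.PutMessagesDetailsEntry(**entry)
            for entry in entries
        ]
    )
    return client.put_messages(stream_id, details).data


entries = [to_entry(f"key-{n}", {"n": n}) for n in range(3)]
print(entries[0])
# publish(entries, "ocid1.stream.oc1..example", "https://<messages-endpoint>")
```

The messages endpoint for a stream is listed on its details page in the console.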

Validate and monitor the solution

After you set up the solution, you can validate and monitor it in several ways.

Check stream metrics

Check the stream's metrics in the console for put messages and get messages. Because Service Connector Hub is the only consumer of the stream in this demo, the get messages graph reflects the get requests from Service Connector Hub.

A graph that shows put message success and put message failure.

A graph that shows get message success and get message failure.

Check bucket metrics

Check the Object Storage bucket metrics. The data in the bucket is organized by partition. Each partition has a zip file that contains a file with the JSON records, each of which maps to a message in the stream.

A graph that shows bucket size and the number of objects.

A screenshot of the console that shows a file hierarchy.
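After downloading one of the archived objects, you can inspect its records locally. The sketch below is a hedged example that accepts either a zip archive or a gzip file, since the exact compression format may differ from what is described here, and it assumes one JSON record per line. The record fields in the demo data are invented for illustration.

```python
import gzip
import io
import json
import zipfile


def read_records(blob):
    """Return the JSON records found in a downloaded archive object."""
    if blob[:2] == b"\x1f\x8b":  # gzip magic bytes
        texts = [gzip.decompress(blob).decode("utf-8")]
    elif zipfile.is_zipfile(io.BytesIO(blob)):
        with zipfile.ZipFile(io.BytesIO(blob)) as archive:
            texts = [archive.read(name).decode("utf-8")
                     for name in archive.namelist()]
    else:
        texts = [blob.decode("utf-8")]  # fall back to plain text

    records = []
    for text in texts:
        for line in text.splitlines():
            if line.strip():  # skip blank lines
                records.append(json.loads(line))
    return records


# Build a small in-memory zip that stands in for a downloaded object.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "sample.json",
        '{"key": "k0", "value": "hello"}\n{"key": "k1", "value": "world"}\n',
    )

records = read_records(buffer.getvalue())
print(len(records), "records")
```

Comparing the number of records in the downloaded objects with the number of messages you published is a simple end-to-end check.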

Check service connector metrics

The Service Connector Hub metrics in the console show the amount of data moved from the source (stream) to the destination (bucket). The following graph shows that the service connector writes to the destination only a few times, because it batches the messages into a zip file before writing to the bucket.

A graph that shows the number of messages read from the source.
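To see why batching produces so few writes, consider a small simulation. It is not the service's actual algorithm, only a sketch that assumes a batch is written when it reaches a size limit or a time limit, whichever comes first. The limits and the message rate are illustrative values.

```python
def count_writes(messages, max_batch_bytes, max_batch_seconds):
    """Count bucket writes for (timestamp_seconds, size_bytes) messages."""
    writes = 0
    batch_bytes = 0
    batch_start = None  # arrival time of the first message in the open batch

    for timestamp, size in messages:
        # Flush the open batch if it has been open for too long.
        if batch_start is not None and timestamp - batch_start >= max_batch_seconds:
            writes += 1
            batch_bytes, batch_start = 0, None
        if batch_start is None:
            batch_start = timestamp
        batch_bytes += size
        # Flush as soon as the batch is large enough.
        if batch_bytes >= max_batch_bytes:
            writes += 1
            batch_bytes, batch_start = 0, None

    if batch_start is not None:  # flush whatever is left at the end
        writes += 1
    return writes


# One 1 KB message per second for ten minutes.
messages = [(second, 1024) for second in range(600)]
writes = count_writes(messages, max_batch_bytes=100 * 1024 * 1024,
                      max_batch_seconds=420)
print(f"{len(messages)} messages -> {writes} writes")
```

With these numbers, 600 messages result in only 2 writes, which matches the shape of the graph: many reads from the source, few writes to the destination.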

To try this solution yourself, sign up for the Oracle Cloud Free Tier or sign in to your Oracle Cloud account.
