Introduction
Performance benchmarking is a critical step in any cloud adoption journey. Whether you are sizing instances for a database migration, validating that a new compute shape meets your throughput requirements, or comparing storage tiers before committing to a production architecture, you need reliable, repeatable numbers. Yet benchmarking in the cloud is often a manual, error-prone process: engineers SSH into instances, install tools, run tests, copy results into spreadsheets, and then tear everything down — only to repeat the cycle when a new shape is released or a configuration changes.
This post introduces an open-source Terraform stack purpose-built for Oracle Cloud Infrastructure (OCI) that removes that manual overhead. With a single Resource Manager apply, the stack provisions networking, compute instances, and block volumes, then automatically runs industry-standard benchmarks (sysbench for CPU and memory, FIO for storage I/O) and centralises the results in OCI Logging. No manual SSH sessions, no one-off scripts to maintain, no results lost in terminal history.
The Problem: Manual Benchmarking Does Not Scale
Cloud providers offer an ever-expanding catalogue of compute shapes, storage tiers, and performance options. OCI alone provides dozens of VM and bare-metal shapes across AMD, Intel, and Arm architectures, each with different core counts, memory ratios, and network bandwidth. Block volumes can be tuned from 0 to 120 Volume Performance Units (VPUs) per GB, yielding vastly different IOPS and throughput profiles.
When teams need to evaluate these options, the typical workflow looks like this:
- Manually create a VCN, subnet, and security rules.
- Launch one or more compute instances.
- SSH in, install sysbench or FIO, figure out the right flags.
- Run the benchmark, scroll through terminal output, copy the numbers somewhere.
- Optionally attach a block volume, partition it, format it, mount it, then repeat the FIO test.
- Tear everything down and hope the notes are complete enough to reproduce later.
This process is slow, inconsistent, and difficult to audit. Different engineers may use different tool versions, flags, or durations, making results incomparable. There is no central record of what was tested, when, or with which parameters.
The Solution: A Fully Automated Benchmark Stack
The stack described in this post is a self-contained Terraform project deployed through OCI Resource Manager. It uses a schema.yaml file to render a guided configuration form in the OCI Console, so no Terraform CLI knowledge is required. The operator fills in the form, clicks Apply, and the stack handles everything from network creation to results collection.
What Gets Deployed
| Component | Details |
| --- | --- |
| Networking | VCN, subnet (public or private), Internet/NAT/Service Gateways, route tables, Network Security Group with preset rule sets. |
| Compute | 1 to 20 instances with flex shape support, spread across Availability Domains. IMDSv2 enforced, Oracle agent plugins enabled. |
| Block Volumes | One block volume per instance (optional). Configurable size (50 GB – 32 TB), performance tier (0-120 VPUs/GB), and attachment type (paravirtualized or iSCSI). |
| Sysbench Benchmark | CPU benchmark with configurable threads, duration, and workload intensity. Optional memory bandwidth test. Runs automatically after provisioning. |
| FIO Benchmark | Storage I/O benchmark on attached block volumes. Configurable test pattern (random/sequential read/write/mixed), block size, I/O depth, parallelism, and duration. Includes sequential read throughput baseline. |
How It Works
Infrastructure Provisioning
When the operator clicks Apply, Terraform creates the VCN, subnet, gateways, and security rules first. It then launches the requested number of compute instances, distributing them across Availability Domains in a round-robin fashion for resilience. If block volumes are enabled, they are created in the same AD as their corresponding instance and attached automatically.
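The round-robin placement described above can be illustrated with a few lines of Python. This is a minimal sketch of the idea only, not the stack's Terraform code: the function name and the AD labels are assumptions made for the example (real AD names carry a tenancy-specific prefix).

```python
from collections import Counter


def assign_availability_domains(instance_count, availability_domains):
    """Return the AD for each instance index, cycling through the ADs in order."""
    if not availability_domains:
        raise ValueError("at least one availability domain is required")
    return [
        availability_domains[i % len(availability_domains)]
        for i in range(instance_count)
    ]


# Hypothetical AD labels, used only for illustration.
ads = ["AD-1", "AD-2", "AD-3"]
placement = assign_availability_domains(5, ads)
print(placement)           # ['AD-1', 'AD-2', 'AD-3', 'AD-1', 'AD-2']
print(Counter(placement))  # how many instances landed in each AD
```

In a region with a single Availability Domain the same logic simply places every instance in that AD. A block volume would then reuse the AD chosen for its instance index.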
Tool Installation via Cloud-Init
Each instance boots with a MIME multipart cloud-init payload that installs three tools: sysbench (from EPEL or built from source as a fallback), FIO (from the OS package manager), and the OCI CLI (for pushing results to OCI Logging). The script creates marker files so the benchmark phase knows exactly when installation is complete. A smoke test confirms each tool works before the marker is written.
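The marker-file handshake can be sketched as follows. This is an illustrative Python version under stated assumptions, not the stack's actual cloud-init script: the function names and marker file names are hypothetical, and the Python interpreter stands in for a benchmark tool so the example runs anywhere. The installer side writes a marker only after a smoke test succeeds, and the benchmark side polls for markers until a deadline.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path


def smoke_test_and_mark(command, marker):
    """Run a tool's smoke-test command; write the marker only if it exits 0."""
    try:
        result = subprocess.run(command, capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False  # tool missing, not executable, or hung
    if result.returncode != 0:
        return False
    Path(marker).touch()
    return True


def wait_for_markers(markers, timeout_s=600, poll_s=5):
    """Poll until every marker file exists; return False if the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        if all(Path(m).exists() for m in markers):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Demo with hypothetical marker names in a temporary directory.
workdir = Path(tempfile.mkdtemp())
ready = workdir / "tool.ready"
missing = workdir / "never-written.ready"

installed = smoke_test_and_mark([sys.executable, "--version"], ready)
print(installed, wait_for_markers([ready], timeout_s=5, poll_s=1))
print(wait_for_markers([missing], timeout_s=1, poll_s=0.2))
```

The default `timeout_s=600` mirrors the 10-minute limit the benchmark phase uses. Writing the marker after the smoke test, rather than after the package install, means a broken install never looks ready.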
Benchmark Execution
Terraform remote-exec provisioners SSH into each instance and run the benchmarks in a deterministic pipeline:
- Wait for cloud-init: Polls for the boot-finished and tool-ready marker files. Times out after 10 minutes if the markers never appear.
- Sysbench CPU: Runs a multi-thread test using all available CPUs (or a specified count), then a single-thread baseline for reference. Extracts events per second, total events, and latency percentiles.
- Sysbench Memory (optional): Measures memory bandwidth with configurable block size and total transfer size.
- FIO Storage I/O: Waits for the block volume device, creates an ext4 filesystem, mounts it, then runs the configured I/O pattern. Also runs a sequential read throughput baseline. Extracts IOPS, bandwidth, and latency for both read and write paths.
- Result Collection: Reads the results file and prints it to the Resource Manager apply logs. Also pushes a JSON payload to OCI Logging using Instance Principal authentication.
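The metric extraction in the sysbench and FIO steps could look like the sketch below. It assumes sysbench 1.0-style text output and FIO's JSON report (`--output-format=json`); the function names, the output field names, and the sample data are invented for illustration, and the stack's own parsing may differ.

```python
import re

SYSBENCH_PATTERNS = {
    "events_per_second": r"events per second:\s*([\d.]+)",
    "total_events": r"total number of events:\s*(\d+)",
    "latency_avg_ms": r"avg:\s*([\d.]+)",
    "latency_p95_ms": r"95th percentile:\s*([\d.]+)",
}


def parse_sysbench_cpu(text):
    """Extract headline metrics from sysbench CPU test output."""
    metrics = {}
    for name, pattern in SYSBENCH_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            metrics[name] = float(match.group(1))
    return metrics


def summarize_fio(report):
    """Reduce an FIO JSON report to IOPS, bandwidth, and latency per direction."""
    job = report["jobs"][0]  # with --group_reporting, jobs are aggregated here
    summary = {}
    for direction in ("read", "write"):
        stats = job[direction]
        summary[direction] = {
            "iops": stats["iops"],
            "bandwidth_kib_s": stats["bw"],  # FIO reports "bw" in KiB/s
            "latency_mean_ms": stats["lat_ns"]["mean"] / 1e6,
        }
    return summary


# Made-up sample data, shaped like real tool output.
sample_sysbench = """
CPU speed:
    events per second:  4321.50

General statistics:
    total time:                          10.0003s
    total number of events:              43218

Latency (ms):
         min:                                    0.22
         avg:                                    0.23
         max:                                    1.90
         95th percentile:                        0.25
"""
sample_fio = {
    "jobs": [{
        "jobname": "bench",
        "read": {"iops": 15000.0, "bw": 60000, "lat_ns": {"mean": 250000.0}},
        "write": {"iops": 5000.0, "bw": 20000, "lat_ns": {"mean": 500000.0}},
    }]
}

cpu = parse_sysbench_cpu(sample_sysbench)
io = summarize_fio(sample_fio)
print(cpu)
print(io)
```

Parsing FIO's JSON report rather than its human-readable output avoids unit-suffix handling, since the JSON fields use fixed units.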
Results in OCI Logging
The stack creates a Log Group and separate Custom Logs for sysbench and FIO results. Each instance pushes its results as a structured JSON log entry tagged with the instance name, benchmark run ID, and timestamp. This means you can filter, search, and compare results across instances and runs directly in the OCI Console under Observability & Management > Logging > Log Search — no spreadsheets required.
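To make the shape of such an entry concrete, here is a hedged sketch of building and filtering tagged results. The field names (`instance_name`, `benchmark_run_id`, and so on) and the sample values are assumptions for illustration; the stack's actual payload schema is not shown in this post, and the sketch only builds the JSON rather than calling OCI Logging.

```python
import json
from datetime import datetime, timezone


def build_log_entry(instance_name, run_id, benchmark, metrics, now=None):
    """Assemble one structured result record tagged with instance, run, and time."""
    now = now or datetime.now(timezone.utc)
    return {
        "instance_name": instance_name,
        "benchmark_run_id": run_id,
        "benchmark": benchmark,
        "timestamp": now.isoformat(),
        "metrics": metrics,
    }


def filter_by_run(entries, run_id):
    """Return only the entries that belong to the given benchmark run."""
    return [e for e in entries if e["benchmark_run_id"] == run_id]


# Hypothetical instance names, run IDs, and metric values.
fixed_time = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
entries = [
    build_log_entry("bench-0", "run-001", "sysbench",
                    {"events_per_second": 4321.5}, now=fixed_time),
    build_log_entry("bench-1", "run-001", "sysbench",
                    {"events_per_second": 4298.1}, now=fixed_time),
    build_log_entry("bench-0", "run-002", "sysbench",
                    {"events_per_second": 5120.7}, now=fixed_time),
]

run_001 = filter_by_run(entries, "run-001")
print(json.dumps(run_001[0], indent=2))
print(len(run_001))  # 2
```

Keeping the run ID and instance name as top-level fields is what makes the same filter possible as a simple field match in Log Search.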
Secure by Design
Security was a first-class concern throughout the design of this stack:
- SSH Key via OCI Vault: The recommended method for providing the SSH private key is through an OCI Vault secret. The key is retrieved at apply time via the oci_secrets_secretbundle data source and never stored in stack configuration variables. A direct-paste fallback (rendered as a masked password field) is available but only shown when no Vault secret is selected.
- Instance Principal Authentication: Instances authenticate to OCI Logging using Instance Principal — no API keys or config files are stored on the instances. A Dynamic Group and IAM Policy are created automatically (or skipped if they already exist).
- IMDSv2 Enforced: Legacy Instance Metadata Service endpoints are disabled on all instances, mitigating SSRF-based metadata theft.
- Minimal Network Exposure: The NSG uses preset rule sets (SSH-only, SSH+HTTP, etc.) rather than open-all defaults. Private subnet mode with NAT Gateway is fully supported.
Why It Matters
Repeatability
Every benchmark run is defined by Terraform variables: shape, OCPU count, thread count, duration, block size, I/O depth, VPUs per GB. Changing the Benchmark Run ID and clicking Apply produces a new set of results with identical methodology. You can compare shapes, storage tiers, or configuration changes with confidence that the test conditions were the same.
Speed
What used to take an engineer an hour of manual setup now takes a single Apply. The stack provisions infrastructure, installs tools, runs benchmarks, and collects results in one automated pass. Subsequent re-runs, with tools already installed, take little longer than the configured benchmark duration.
Auditability
Results are not lost in terminal scrollback. They live in OCI Logging with full metadata: which instance, which shape, which parameters, which run ID, when. Teams can review historical results, compare across time periods, and build dashboards on top of the log data.
Accessibility
The Resource Manager schema.yaml renders a user-friendly form with conditional visibility, dropdowns, and descriptions. An operator who has never written a line of Terraform can deploy the stack, run benchmarks, and interpret the results. FIO options only appear when block volumes are enabled. Memory benchmark options only appear when sysbench is enabled. The form guides the user through exactly the choices that matter.
Cost Efficiency
Both sysbench and FIO are open source under the GPL. There are no license fees, no per-run charges, and no requirement to upload results to a third party. The only cost is the OCI infrastructure itself, which can be torn down immediately after the benchmark completes.
Use Cases
- Shape Evaluation: Compare events-per-second across E4.Flex, E5.Flex, A1.Flex, and Optimized3.Flex to find the best price-performance ratio for your CPU-bound workload.
- Storage Tier Selection: Run FIO with 10 VPUs/GB vs. 60 VPUs/GB vs. 120 VPUs/GB on the same shape to quantify the IOPS and latency improvement before committing to a higher tier.
- Capacity Planning: Scale from 1 to 20 instances with identical parameters to validate that per-instance performance remains consistent under multi-tenancy.
- Regression Detection: Schedule periodic benchmark runs with the same Run ID pattern to detect performance regressions after platform updates or maintenance events.
- Pre-Migration Validation: Before migrating a database or application, run the exact I/O pattern (block size, depth, read/write mix) that your workload produces to confirm the target environment meets requirements.
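The pre-migration case above amounts to translating a workload's I/O profile into FIO options. The sketch below shows one way to do that; the function name, defaults, and file path are hypothetical, while the FIO options themselves (`--rw`, `--rwmixread`, `--bs`, `--iodepth`, `--numjobs`, and so on) are standard. It only builds the command line and does not run FIO.

```python
ALLOWED_PATTERNS = {"read", "write", "randread", "randwrite", "rw", "randrw"}


def build_fio_command(filename, pattern="randrw", read_pct=70, block_size="8k",
                      io_depth=32, num_jobs=4, runtime_s=60, size="1g"):
    """Build an FIO argument list for a given I/O profile."""
    if pattern not in ALLOWED_PATTERNS:
        raise ValueError(f"unsupported pattern: {pattern}")
    command = [
        "fio",
        "--name=workload",
        f"--filename={filename}",
        f"--size={size}",
        f"--rw={pattern}",
        f"--bs={block_size}",
        f"--iodepth={io_depth}",
        f"--numjobs={num_jobs}",
        f"--runtime={runtime_s}",
        "--time_based",           # run for the full runtime, not just one pass
        "--ioengine=libaio",      # asynchronous I/O so iodepth takes effect
        "--direct=1",             # bypass the page cache
        "--group_reporting",      # aggregate results across jobs
        "--output-format=json",
    ]
    if pattern in {"rw", "randrw"}:
        # The read/write mix only applies to mixed patterns.
        command.append(f"--rwmixread={read_pct}")
    return command


# Hypothetical profile: 8 KiB random I/O, 70% reads, on a mounted test volume.
command = build_fio_command("/mnt/bench/fio.dat", pattern="randrw", read_pct=70)
print(" ".join(command))
```

Returning a list rather than a shell string keeps each option a separate argument, which avoids quoting problems if the command is later passed to a process launcher.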
Conclusion
Cloud benchmarking should not be a manual chore. By encoding the entire workflow — infrastructure provisioning, tool installation, benchmark execution, and result collection — into a single Terraform stack with a guided Resource Manager form, this project turns performance testing into a repeatable, auditable, one-click operation.
The stack is also designed to be extensible — the modular cloud-init and benchmark pipeline pattern means that additional tools (such as iperf3 for network throughput, stress-ng for advanced CPU stress testing, or custom application-specific benchmarks) can be integrated following the same approach with minimal effort. Whether you are an architect evaluating shapes for a new project, an operations engineer validating storage performance before a migration, or a platform team building a benchmarking pipeline, this stack provides the foundation. Fork it, extend it with your own workloads, and make data-driven infrastructure decisions with confidence.
