Join us in test driving the new Oracle Cloud Infrastructure (OCI) GPU shapes with 3D graphics rendering, animation, and ray-tracing capabilities. We walk you through simple rendering and animation tasks using the Blender classroom sample, summarize the performance findings, and compare GPU-based rendering with CPU-based rendering.
OCI provides several bare metal and virtual machine (VM) instances powered by NVIDIA A10 Tensor Core GPUs suitable for a variety of accelerated workloads, including artificial intelligence (AI), machine learning (ML) inferencing, computational fluid dynamics (CFD), and virtual desktops when paired with NVIDIA RTX Virtual Workstation software.
Of particular interest to this blog are the OCI GPU Compute shapes based on NVIDIA A10 Tensor Core GPUs, which are best suited for 4K video and gaming and related AI applications such as Stable Diffusion and NVIDIA Omniverse. These shapes on OCI network infrastructure also support SMPTE ST 2110 uncompressed video transport for real-time video production and playout applications. The A10-based GPU shapes offer superior price-performance with a list price of $2 per hour per GPU for pay as you go (PAYG) instances.
The A10 GPUs are available in a bare metal shape (BM.GPU.A10.4) with four A10 GPUs and 96 GB of GPU memory, on a system powered by Intel Xeon Platinum 8358 processors with a total of 64 OCPUs (the equivalent of 128 vCPUs), 1 TB of CPU memory, two NVMe drives totaling 7.68 TB of local storage, and 2 x 50 Gbps of network bandwidth.
The A10 GPUs are also available as VM instances with the following features and benefits:
VM.GPU.A10.1 with one GPU and 24 GB of GPU memory supported by 15 OCPUs, 240 GB of CPU memory, and 24 Gbps of network bandwidth
VM.GPU.A10.2 with two GPUs and 48 GB of GPU memory supported by 30 OCPUs, 480 GB of CPU memory, and 48 Gbps of network bandwidth
The VM shapes only come with block storage.
Check out an excellent blog post by Jeff Davies about running Blender on Oracle Cloud Infrastructure. We test drove the new A10 GPU VM shapes on OCI to gather metrics for sample rendering tasks using the following methodology:
Provisioned the required Compute shapes, VM.GPU.A10.1 and VM.GPU.A10.2, with Ubuntu 20.04 as the operating system.
Installed NVIDIA drivers.
Installed Blender 3.5.0.
Downloaded the classroom demo project from the Blender website.
Ran the Blender commands to render a single frame and multiple frames of the classroom scene, which uses the Cycles render engine with NVIDIA OptiX ray-tracing functionality.
The A10 GPU-based shapes excel at rendering tasks, as the metrics below show by contrasting GPU- and CPU-based rendering times. The Blender command line interface used the following Python script to select OptiX as the Cycles compute device type and to render on the GPUs:
rendersettings.py
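import bpy

# Cycles add-on preferences hold the compute device configuration
prop = bpy.context.preferences.addons['cycles'].preferences
prop.get_devices()  # refresh the list of available devices
prop.compute_device_type = 'OPTIX'

# Enable only the OptiX devices; disable everything else (including the CPU)
for device in prop.devices:
    if device.type == 'OPTIX':
        device.use = True
    else:
        device.use = False

# Render on the GPU in the active scene and in every other scene in the file
bpy.context.scene.cycles.device = 'GPU'
for scene in bpy.data.scenes:
    scene.cycles.device = 'GPU'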
For single-frame rendering using the Blender command line headless invocation, we ran the following command for CPU-based rendering:
blender -b classroom.blend -o //classroom -f 1 -F PNG -noaudio > blender-CPU-F1.log
For GPU-based rendering on A10 GPU-based VMs, we ran the following command:
blender -b classroom.blend -o //classroom -f 1 -F PNG -noaudio -E CYCLES -- --cycles-device OPTIX --cycles-print-stats -P ~/rendersettings.py --debug-cycles > blender-GPU-F1.log
We also ran multiple-frame rendering using the Blender command line headless invocation. For CPU-based rendering of multiple frames, we ran the following command:
blender -b classroom.blend -o //classroom -s 1 -e 6 -F PNG -noaudio -a > blender-CPU-F6.log
Then we ran GPU-based rendering for multiple frames:
blender -b classroom.blend -o //classroom -s 1 -e 6 -F PNG -noaudio -a -E CYCLES -- --cycles-device OPTIX -P ~/rendersettings.py > blender-GPU-F6.log
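When adapting these invocations, keep in mind that Blender's command line documentation states that arguments are executed in the order given, and that everything after `--` is passed through unchanged rather than parsed by Blender itself. The following is a minimal sketch of a helper that assembles an argument list with that ordering in mind. The function name `blender_render_args` and its parameters are hypothetical and are not part of Blender or of the commands above. It only uses the flags that already appear in this post.

```python
import shlex


def blender_render_args(blend_file, output, frames, use_gpu=False,
                        script=None, engine="CYCLES"):
    """Build a headless Blender argument list (hypothetical helper).

    frames is either an int (single frame, rendered with -f) or a
    (start, end) tuple (animation, rendered with -s/-e/-a).
    """
    args = ["blender", "-b", "-noaudio", blend_file]
    if script:
        # -P has to appear before "--", otherwise Blender does not run it
        args += ["-P", script]
    # Options that the render depends on come before the render trigger
    args += ["-E", engine, "-o", output, "-F", "PNG"]
    if isinstance(frames, tuple):
        start, end = frames
        args += ["-s", str(start), "-e", str(end), "-a"]
    else:
        args += ["-f", str(frames)]
    if use_gpu:
        # Arguments after "--" are read by the Cycles add-on
        args += ["--", "--cycles-device", "OPTIX"]
    return args


if __name__ == "__main__":
    single = blender_render_args("classroom.blend", "//classroom", 1,
                                 use_gpu=True, script="rendersettings.py")
    animation = blender_render_args("classroom.blend", "//classroom", (1, 6))
    print(shlex.join(single))
    print(shlex.join(animation))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues. Always passing `-E` is a choice made for this sketch; the CPU commands above rely on the engine stored in the .blend file instead.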
These commands produced the following results. In this table, an OCPU is equivalent to one physical processor core with hyper-threading enabled, so one OCPU corresponds to two hardware processing threads, or vCPUs.
Instance type | GPU/CPU | Frames | Rendering time |
---|---|---|---|
VM.Standard.E4.Flex | 16 OCPU | 1 | 3 minutes 48 seconds |
VM.Standard.E4.Flex | 32 OCPU | 1 | 1 minute 55 seconds |
VM.GPU.A10.1 | One A10 GPU | 1 | 19.3 seconds |
VM.GPU.A10.2 | Two A10 GPUs | 1 | 11.4 seconds |
VM.Standard.E4.Flex | 16 OCPU | 6 | 22 minutes 42 seconds |
VM.Standard.E4.Flex | 32 OCPU | 6 | 11 minutes 30 seconds |
VM.GPU.A10.1 | One A10 GPU | 6 | 1 minute 56 seconds |
VM.GPU.A10.2 | Two A10 GPUs | 6 | 1 minute 7 seconds |
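To relate these timings to the list price of $2 per hour per GPU quoted earlier, the following rough sketch converts the GPU rendering times to seconds per frame and to an approximate GPU cost per frame. It assumes that cost scales linearly with rendering time and with the number of GPUs in the shape, and it ignores provisioning time, storage, and any minimum billing period. Those are simplifying assumptions for illustration, not OCI billing rules. The helper names `to_seconds` and `cost_per_frame` are hypothetical.

```python
import re

GPU_PRICE_PER_HOUR = 2.00  # USD per GPU per hour (PAYG list price quoted in this post)


def to_seconds(text):
    """Convert '1 minute 56 seconds' or '19.3 seconds' to a number of seconds."""
    minutes = re.search(r"([\d.]+)\s*minute", text)
    seconds = re.search(r"([\d.]+)\s*second", text)
    total = float(minutes.group(1)) * 60 if minutes else 0.0
    total += float(seconds.group(1)) if seconds else 0.0
    return total


def cost_per_frame(gpus, frames, time_text):
    """Approximate GPU cost per frame, assuming purely time-proportional billing."""
    hours = to_seconds(time_text) / 3600
    return gpus * GPU_PRICE_PER_HOUR * hours / frames


# (shape, GPU count, frames, rendering time) copied from the GPU rows of the table
gpu_runs = [
    ("VM.GPU.A10.1", 1, 1, "19.3 seconds"),
    ("VM.GPU.A10.2", 2, 1, "11.4 seconds"),
    ("VM.GPU.A10.1", 1, 6, "1 minute 56 seconds"),
    ("VM.GPU.A10.2", 2, 6, "1 minute 7 seconds"),
]

if __name__ == "__main__":
    for shape, gpus, frames, time_text in gpu_runs:
        per_frame = to_seconds(time_text) / frames
        cost = cost_per_frame(gpus, frames, time_text)
        print(f"{shape}, {frames} frame(s): {per_frame:.1f} s/frame, "
              f"about ${cost:.4f} per frame")
```

Under these assumptions, the two-GPU shape finishes each frame sooner, while the one-GPU shape spends slightly less GPU time per frame.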
Performance improvement with GPU-based rendering, expressed as the percentage reduction in rendering time relative to each CPU configuration:
GPU shape | Rendering performance improvement over E4.Flex with 16 OCPU | Rendering performance improvement over E4.Flex with 32 OCPU |
---|---|---|
VM.GPU.A10.1 (1 frame) | 91.5% | 83.2% |
VM.GPU.A10.2 (1 frame) | 95% | 90.1% |
VM.GPU.A10.1 (6 frames) | 91.5% | 83.2% |
VM.GPU.A10.2 (6 frames) | 95% | 90.3% |
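The percentages in this table are consistent with the reduction in rendering time, that is, 1 minus the ratio of GPU time to CPU time. The short sketch below recomputes them from the measured times, converted to seconds. The function name `reduction_percent` is ours, and the speedup factor it prints alongside is an extra derived figure rather than one reported in the table.

```python
def reduction_percent(cpu_seconds, gpu_seconds):
    """Percentage reduction in rendering time when moving from CPU to GPU."""
    return (1 - gpu_seconds / cpu_seconds) * 100


# Rendering times in seconds, taken from the results table
cpu_times = {
    ("16 OCPU", 1): 228.0,   # 3 minutes 48 seconds
    ("32 OCPU", 1): 115.0,   # 1 minute 55 seconds
    ("16 OCPU", 6): 1362.0,  # 22 minutes 42 seconds
    ("32 OCPU", 6): 690.0,   # 11 minutes 30 seconds
}
gpu_times = {
    ("VM.GPU.A10.1", 1): 19.3,
    ("VM.GPU.A10.2", 1): 11.4,
    ("VM.GPU.A10.1", 6): 116.0,  # 1 minute 56 seconds
    ("VM.GPU.A10.2", 6): 67.0,   # 1 minute 7 seconds
}

if __name__ == "__main__":
    for (shape, frames), gpu_s in gpu_times.items():
        for cpu_label in ("16 OCPU", "32 OCPU"):
            cpu_s = cpu_times[(cpu_label, frames)]
            print(f"{shape}, {frames} frame(s) vs {cpu_label}: "
                  f"{reduction_percent(cpu_s, gpu_s):.1f}% less time, "
                  f"{cpu_s / gpu_s:.1f}x faster")
```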
We hope that these benchmarks help you appreciate the performance, speed, and price benefits offered by the NVIDIA A10-based GPU shapes on OCI for your graphics rendering, animation, video, and ML workloads and solutions. Try Oracle Cloud Infrastructure yourself for free and explore the capabilities.
For more information, see the following resources:
OCI GPU VM shapes (Documentation)
OCI GPU bare metal shapes (Documentation)