Generative AI is taking the world by storm, from large language models (LLMs) like generative pretrained transformer (GPT) models to diffusion models. Powered by NVIDIA technology, Oracle Cloud Infrastructure (OCI) is uniquely positioned to accelerate generative AI workloads, and those for data processing, analytics, high-performance computing (HPC), quantitative financial applications, and more. It’s a one-stop solution for diverse workload needs as we increasingly see a convergence of HPC quantitative finance and AI requirements from end users.
In this blog post, we begin by focusing on the quantitative applications setting new records on OCI with NVIDIA GPUs. In financial risk management applications, for example, OCI powered by NVIDIA GPUs offers incredible speed with great efficiency and cost savings. NVIDIA A100 Tensor Core GPUs were featured in a stack that set several records in a recent STAC-A2™ audit with 8 x NVIDIA A100 GPUs in an Oracle Cloud BM.GPU4.8 Instance (SUT ID NVDA231026). The system was independently audited by the Strategic Technology Analysis Center (STAC®). STAC and all STAC names are trademarks or registered trademarks of the Strategic Technology Analysis Center.
STAC-A2 is the technology benchmark standard based on financial market risk analysis. Designed by quants and technologists from some of the world’s largest banks, STAC-A2 reports the performance, scaling, quality, and resource efficiency of any technology stack that can handle the workload. The benchmark is a Monte Carlo estimation of Heston-based Greeks for a path-dependent, multi-asset option with early exercise. The workload can serve as a proxy for price discovery, for market risk calculations such as sensitivity Greeks, profit and loss, and value at risk (VaR), and for counterparty credit risk (CCR) workloads such as credit valuation adjustment (CVA) and the margin calculations that financial institutions perform for trading and risk management.
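To make the benchmark description more concrete, here is a minimal sketch of Monte Carlo Greek estimation under the Heston model. It is deliberately much simpler than the STAC-A2 workload: it prices a single-asset European call rather than a path-dependent, multi-asset option with early exercise, and it is not the NVIDIA STAC Pack implementation. The full-truncation Euler scheme, the bump-and-revalue delta, the function names, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def heston_terminal_prices(s0, v0, r, kappa, theta, xi, rho, t, steps, n_paths, seed):
    """Simulate terminal prices with a full-truncation Euler scheme for Heston."""
    rng = random.Random(seed)  # fixed seed gives common random numbers across bumps
    dt = t / steps
    sqrt_dt = math.sqrt(dt)
    rho_bar = math.sqrt(1.0 - rho * rho)
    prices = []
    for _ in range(n_paths):
        log_s, v = math.log(s0), v0
        for _ in range(steps):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rho * z1 + rho_bar * rng.gauss(0.0, 1.0)  # correlated variance shock
            v_pos = max(v, 0.0)  # truncate negative variance before using it
            log_s += (r - 0.5 * v_pos) * dt + math.sqrt(v_pos) * sqrt_dt * z1
            v += kappa * (theta - v_pos) * dt + xi * math.sqrt(v_pos) * sqrt_dt * z2
        prices.append(math.exp(log_s))
    return prices


def call_price(strike, **model):
    """Discounted Monte Carlo average of a European call payoff."""
    terminal = heston_terminal_prices(**model)
    payoff = sum(max(s - strike, 0.0) for s in terminal) / len(terminal)
    return math.exp(-model["r"] * model["t"]) * payoff


def delta_bump_and_revalue(strike, rel_bump=0.01, **model):
    """Central finite-difference delta, reusing the same seed for both bumps."""
    h = model["s0"] * rel_bump
    up = call_price(strike, **{**model, "s0": model["s0"] + h})
    down = call_price(strike, **{**model, "s0": model["s0"] - h})
    return (up - down) / (2.0 * h)


# Hypothetical parameters chosen only for illustration.
MODEL = dict(s0=100.0, v0=0.04, r=0.03, kappa=1.5, theta=0.04, xi=0.3,
             rho=-0.7, t=1.0, steps=50, n_paths=2000, seed=42)

price = call_price(100.0, **MODEL)
delta = delta_bump_and_revalue(100.0, **MODEL)
print(f"price={price:.4f} delta={delta:.4f}")
```

Every path is independent of the others, which is why this kind of workload maps well onto GPUs: a production implementation would typically assign paths to threads and draw the random numbers from a GPU generator instead of looping in Python.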
The STAC-A2 tests were performed using an NVIDIA-authored STAC Pack on OCI hardware. In all, the STAC-A2 specifications delivered more than 200 test results, which are summarized in the STAC Report. OCI wants to draw attention to the following points:
The financial industry’s key concerns have been pricing and risk calculation, which rely heavily on the latest technologies for instantaneous calculations and real-time decision-making for trading. Pricing and risk calculation, algorithmic trading model development, and backtesting need a robust scalable environment with the fastest interconnects and networking.
The OCI Compute GPU instance, BM.GPU4.8, and other instances based on NVIDIA Ampere and Hopper architectures provide that solution. These workloads can run standalone or be scaled to each end user's needs, for example for market risk VaR calculations or for CCR calculations such as CVA. In areas like CVA, scaled setups have been shown, separately from STAC benchmarking, to reduce the number of nodes from 100 to 4 in simulation- and compute-intensive calculations.
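As a rough illustration of what a market risk VaR calculation involves, the sketch below estimates one-day VaR from simulated profit-and-loss scenarios. The normally distributed returns, the position size, the volatility, and the function names are all hypothetical. A real risk engine would revalue an entire portfolio under each scenario, and that revaluation is the compute-intensive step that benefits from GPU scaling.

```python
import random


def value_at_risk(pnl_scenarios, confidence=0.99):
    """Return VaR as a positive number: the loss at the given confidence quantile."""
    losses = sorted(-pnl for pnl in pnl_scenarios)  # ascending, largest loss last
    index = min(len(losses) - 1, int(confidence * len(losses)))
    return losses[index]


def simulate_pnl(position_value, daily_vol, n_scenarios, seed):
    """Hypothetical one-day P&L scenarios from a normally distributed return."""
    rng = random.Random(seed)
    return [position_value * rng.gauss(0.0, daily_vol) for _ in range(n_scenarios)]


# Illustrative inputs only: a 1,000,000 position with 1% daily volatility.
pnl = simulate_pnl(position_value=1_000_000.0, daily_vol=0.01, n_scenarios=20_000, seed=7)
var_95 = value_at_risk(pnl, 0.95)
var_99 = value_at_risk(pnl, 0.99)
print(f"95% VaR: {var_95:,.0f}  99% VaR: {var_99:,.0f}")
```

The quantile estimate becomes more stable as the number of scenarios grows, which is one reason these calculations are scaled out across many GPUs or nodes.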
The OCI-based solution enables scaling up with NVIDIA GPUs using fewer nodes. It combines the highest performance with the lowest operating cost and makes it easy to adopt cutting-edge hardware for solutions in the cloud. The solutions can extend to other workloads, such as AI and quantitative modeling, using techniques that range from traditional quantitative models to machine learning, such as XGBoost, and deep learning, such as long short-term memory (LSTM) networks, recurrent neural networks (RNNs), and other advanced areas. These models must be backtested on various ticker symbols for different products, so they need a flexible cloud infrastructure, such as OCI Compute with NVIDIA GPU instances.
NVIDIA provides all the key software component layers. NVIDIA offers multiple options to developers, including the NVIDIA CUDA software development kit (SDK) for CUDA and C++. It also enables other languages and directive-based solutions, such as OpenMP, OpenACC, C++17 standard parallelism, and Fortran parallel constructs, through the NVIDIA HPC SDK.
The implementation used for STAC-A2 was developed on CUDA 12.0 and uses the highly optimized libraries delivered with CUDA: cuBLAS, the GPU-enabled implementation of the linear algebra package BLAS, and cuRAND, a parallel and efficient GPU implementation of random number generators. The STAC Pack used CUDA Toolkit 12.2, which includes NVCC 12.2.91 and the associated CUDA libraries, together with GCC 11.2.1.
The different components of the implementation were designed in a modular and maintainable framework using object-oriented programming. All floating-point operations were conducted in IEEE-754 double precision (64 bits). The STAC-A2 implementation was developed using tools that NVIDIA provides to help debug and profile CUDA code. These tools include NVIDIA Nsight Systems for timeline profiling, NVIDIA Nsight Compute for kernel profiling, and NVIDIA Compute Sanitizer and CUDA-GDB for debugging.
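The choice of double precision matters for long accumulations such as Monte Carlo averages. The sketch below is an illustration only and is not taken from the STAC Pack. It emulates IEEE-754 single precision in pure Python by rounding the running total to binary32 after every addition, then compares the result with ordinary double-precision accumulation. The helper names and the increment of 0.01 are assumptions.

```python
import struct


def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate(increment, count, single_precision):
    """Add `increment` to a running total `count` times in the chosen precision."""
    total = 0.0
    step = to_float32(increment) if single_precision else increment
    for _ in range(count):
        total += step
        if single_precision:
            total = to_float32(total)  # emulate a binary32 accumulator
    return total


COUNT = 1_000_000
sum64 = accumulate(0.01, COUNT, single_precision=False)
sum32 = accumulate(0.01, COUNT, single_precision=True)
print(f"double: {sum64!r}  single: {sum32!r}  target: {0.01 * COUNT!r}")
```

Once the single-precision total grows large relative to the increment, each addition is rounded to a coarser grid and the error compounds, while the double-precision total stays close to the target.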
The convergence of HPC and AI is happening as financial firms, including global market banks, insurers, hedge funds, market-makers, high-frequency traders, and asset managers, work on big-picture solutions. These combine various modeling techniques, including HPC quantitative finance, machine learning (ML), reinforcement learning (RL) and AI neural nets, and natural language processing (NLP) generative AI with LLMs.
Organizations can apply LLMs, together with techniques such as retrieval augmented generation (RAG), to unstructured sources of information such as financial news. These sources, known as “alternative data,” offer an information edge beyond traditional tabular market data. Organizations are converging NLP with generative AI, creating new signals, and feeding them as inputs into quantitative calculations. Enterprise customers can benefit from customizing such LLMs to understand the financial domain better and meet their individual needs with greater accuracy by combining AI and quantitative financial models in their workflows. In addition, signal generation by such models, along with trading, risk, and pricing calculations, is performed in real time and repeated many times to backtest the models and monitor them on an ongoing basis as market conditions change.
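The retrieval step of RAG can be sketched in a few lines. The example below is a toy under stated assumptions: the headlines are invented, bag-of-words cosine similarity stands in for the learned embeddings a real system would likely use, and the function names are hypothetical. It stops at building the prompt and does not call any LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, documents, top_k=1):
    """Return the top_k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine_similarity(query_vec, Counter(tokenize(doc))), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, documents, top_k=1):
    """Prepend the retrieved context to the question, ready to send to an LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Invented headlines, for illustration only.
NEWS = [
    "Central bank raises interest rate by 25 basis points",
    "Chipmaker reports record quarterly revenue on data center demand",
    "Oil prices fall as inventories rise",
]
QUESTION = "What did the central bank do to the interest rate?"
prompt = build_prompt(QUESTION, NEWS)
print(prompt)
```

Grounding the model in retrieved text is what lets it answer from current news rather than from its training data alone. The answer, or a score derived from it, can then feed a quantitative model as a signal.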
Powered by NVIDIA technology, Oracle Cloud Infrastructure is uniquely positioned to accelerate workloads ranging from HPC quantitative financial applications and data processing to analytics and generative AI, providing maximum value and return on investment (ROI) and reducing total cost of ownership (TCO) for customers looking to integrate diverse workloads into their financial applications.
Amarendra Joshi is a Director in North America Cloud Engineering. His team helps customers leverage Oracle Cloud Infrastructure for their cloud computing needs.
HPC/AI Enablement and Performance, GPU and Parallel Computing Solutions.
Florent is a principal developer technology (DevTech) engineer at NVIDIA. He earned a PhD in applied mathematics at INRIA in 2005. After graduation, he consulted for several financial institutions, mainly investment banks, first in electronic markets and then with quantitative research teams. An early adopter of CUDA, he introduced GPU computing at several institutions in finance, insurance, and oil and gas. Today, Florent works on optimizing CUDA implementations for quantitative finance and energy simulations.
Prabhu Ramamoorthy is the financial ecosystem partner manager at NVIDIA, where he focuses on quant and AI acceleration for financial services. Previously, he was head of technology at the margin software firm Dash Regtech, catering to leading investment banks. He also served as a director at KPMG/EY, where he helped 100+ financial institutions over the last 10 years. Ramamoorthy holds an MBA from the University of Wisconsin-Madison and an undergraduate degree in Engineering from BITS-Pilani, one of the top engineering institutes in India. He is a CFA charterholder, financial risk manager, and chartered alternative investment analyst specializing in financial transformation use cases.