
GPU Benchmarking for Drug Cardiotoxicity Prediction on Oracle Cloud


Written by:

Igor Vorobyov, PhD, Assistant Professor, Department of Physiology and Membrane Biology, University of California, Davis

Colleen E. Clancy, PhD, Professor, Department of Physiology and Membrane Biology, University of California, Davis

Rajib Ghosh, Global Senior Solutions Architect, Oracle for Research

To kick off our Voice of the Researcher blog series, we are delighted to introduce guest authors Igor Vorobyov and Colleen E. Clancy of the University of California, Davis, who discuss their research project to predict drug-induced cardiotoxicities using Oracle Cloud.

Research project 

Drug cardiotoxicity occurs when a drug intended to cure one ailment also harms a patient's heart, potentially leading to deadly irregular heartbeats known as cardiac arrhythmias. It is a serious and expensive problem: nearly 10% of drugs in the past several decades have been pulled from the clinical market due to cardiovascular concerns. Moreover, up to 50-70% of drug candidates are eliminated early in the development process due to their potential to cause cardiac arrhythmias. The existing guidelines for drug cardiotoxicity risk assessment are not selective and can lead to the abandonment of safe and effective medications. To truly assess cardiotoxicity risk, drugs must also be tested in the context of comorbidities: what will this drug do to a patient who already has complicating factors that increase their risk of heart problems? The focus of the research project in the Clancy and Vorobyov laboratories at the University of California, Davis (UC Davis) is to use advanced molecular dynamics (MD) simulations to develop AI-driven in silico multi-scale functional models that predict drug-induced cardiotoxicities from chemical structures. Oracle Cloud enterprise-scale computing, including high-performance bare metal CPU and GPU shapes used in combination with GPU-enabled software tools, accelerates results by running simulations faster and more efficiently. This is illustrated here with an example of all-atom MD simulations of sotalol, a drug with a high pro-arrhythmia risk, interacting with the cardiac membrane protein beta-1 adrenergic receptor embedded in a hydrated lipid membrane. You can learn more about the UC Davis project here and here.


Nanoscale molecular dynamics (NAMD) is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 500,000 cores for the largest simulations. NAMD uses the popular molecular graphics program visual molecular dynamics (VMD) for simulation setup and trajectory analysis, but is also file-compatible with the AMBER, CHARMM, and X-PLOR molecular modeling packages. NAMD is available free of charge with source code and pre-compiled binaries, and it is a prime choice for research-intensive computations in the MD simulation space.

NAMD and VMD were also used by the team that won the 2020 ACM Gordon Bell Special Prize for High Performance Computing-Based COVID-19 Research with their paper AI-Driven Multiscale Simulations Illuminate Mechanisms of SARS-CoV-2 Spike Dynamics, presented on November 18, 2020 at Supercomputing 2020. The prize recognizes outstanding achievement toward the understanding of the COVID-19 pandemic through the use of high performance computing.

NAMD 3.0a (alpha) has a single-node, single-GPU, GPU-resident computation mode that can speed up simulations by more than 2x on GPU architectures such as NVIDIA Volta, available in Oracle BM.GPU3.8 shapes. NAMD 3.0a is one of the first CUDA (Compute Unified Device Architecture) accelerated applications geared to maximize performance for medium-size molecular dynamics simulations (in the range of 10K to 1M atoms). This range suits the computational capabilities of a GPU-accelerated compute node using a GPU-specific code path that offloads the integrator and rigid bond constraints to the GPU, bypassing the overheads associated with CPU activities that slow down simulations. At present, only a single-GPU offloading scheme is available; multi-GPU acceleration of a single replica is part of ongoing development work.
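As a brief sketch of how this mode is enabled (keyword per the NAMD 3.0 alpha release notes; the file name and launch command are illustrative), the GPU-resident code path is turned on in the simulation configuration file:

```tcl
# Illustrative NAMD 3.0a configuration fragment: enable the
# single-GPU-resident integrator, which offloads integration and
# rigid-bond constraints to the GPU.
CUDASOAintegrate  on

# Launched with a single CPU core and one GPU device, e.g.:
#   namd3 +p1 +devices 0 run.namd
```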

Molecular dynamics simulation algorithm 

Molecular dynamics (MD) simulation typically employs a time-stepping algorithm to propagate molecular interactions in time. In a nanosecond-scale simulation, this timestep algorithm executes millions of times, evaluating the forces and energies of interatomic interactions. This force calculation comprises about 99% of the computational workload. Another key step of the timestep process is numerical integration, where the calculated forces are applied to update atomic positions and velocities; it is a lightweight computation, occupying only about 1% of the overall FLOPS per timestep. In CPU-bound codes this integration step runs on the CPU and produces GPU idle time in each timestep cycle. Over millions of timesteps, moving numerical integration to the GPU removes this idle bottleneck and allows simulations to execute more quickly.
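The timestep structure described above can be sketched in Python. This is a toy one-dimensional velocity Verlet integrator, not NAMD's implementation, which makes the split between force evaluation and integration arithmetic explicit:

```python
def velocity_verlet_step(x, v, force, mass, dt):
    """One MD timestep. In real MD codes the force() evaluations
    dominate the cost (~99% of the FLOPs); the integration
    arithmetic below is the cheap ~1% that a GPU-resident code
    path keeps on the GPU to avoid idle cycles."""
    f = force(x)
    v_half = v + 0.5 * dt * f / mass                  # half-kick with current forces
    x_new = x + dt * v_half                           # drift: update positions
    v_new = v_half + 0.5 * dt * force(x_new) / mass   # second half-kick with new forces
    return x_new, v_new

# Toy usage: a harmonic oscillator (spring constant k = 1),
# which a symplectic integrator should keep near-constant in energy.
force = lambda x: -1.0 * x
x, v = 1.0, 0.0
for _ in range(1000):
    x, v = velocity_verlet_step(x, v, force, mass=1.0, dt=0.01)
```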
Molecular dynamics simulation packages that leverage this can perform much faster on NVIDIA Volta/Turing tensor-core architectures, such as Oracle BM.GPU3.8 or BM.GPU4.8 instances. Even though these GPU optimizations are geared toward single-GPU use, they also benefit multi-GPU use cases such as multi-replica simulations. The result is better overall performance with fewer CPU cores per simulation, a bottleneck for CPU-bound multi-copy simulations such as those in NAMD 2.14.

Research use-case

The simulations were carried out for the research use-case of a beta-1 adrenergic receptor - cationic l-sotalol complex embedded in a 1-palmitoyl-2-oleoylphosphatidylcholine / 1-palmitoyl-2-oleoylphosphatidylserine (POPC/POPS) mixed lipid bilayer and solvated in 0.15 M NaCl aqueous solution, with a total system size of 244,187 atoms. The all-atom CHARMM36m protein and C36 lipid force fields were used, along with recently developed sotalol parameters, standard CHARMM ion parameters, and TIP3P water. The simulations ran in the NPT ensemble at 310 K and 1 atm pressure using a standard 2.0 fs (2.0 × 10⁻¹⁵ s) time step, constraints on bonds involving H atoms via the SHAKE and SETTLE (for water) algorithms, standard non-bonded interaction cutoffs of 12 Å tapered at 10 Å with a force-switching function, the particle mesh Ewald (PME) scheme for long-range electrostatics, tetragonal periodic boundary conditions, and Langevin dynamics and a Langevin piston for temperature and pressure control, respectively. For this computational use-case we used CUDA-enabled multicore NAMD 2.14 or 2.14b1 and NAMD 3.0 alpha on Oracle Cloud virtual and bare metal GPU instances with GPU2.x (Pascal) and GPU3.x (Volta) architectures. The benchmarks were performed using 10 ps (10⁻¹¹ s) long NAMD runs, corresponding to 5,000 MD steps. Different CPU core affinity options were tested, and those providing the best performance were selected.
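For reference, these run parameters map onto a NAMD configuration roughly as follows. This is a minimal, illustrative fragment, not the actual input files from the study; the keyword names are standard NAMD options, but values such as the PME grid spacing and the thermostat/barostat coupling constants would need tuning for a real run:

```tcl
timestep             2.0       ;# fs
rigidBonds           all       ;# SHAKE/SETTLE constraints on bonds involving H
cutoff               12.0      ;# non-bonded cutoff, Angstroms
switching            on
switchdist           10.0      ;# taper non-bonded interactions from 10 A
vdwForceSwitching    on        ;# force-switching function
PME                  yes       ;# particle mesh Ewald long-range electrostatics
PMEGridSpacing       1.0
langevin             on        ;# Langevin dynamics thermostat
langevinTemp         310       ;# K
langevinPiston       on        ;# Langevin piston barostat
langevinPistonTarget 1.01325   ;# ~1 atm, in bar
```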

To run an MD simulation on a computer cluster, the user has to specify numerous parameters that control the behavior of the underlying hardware and software. Moreover, optimal parameters may vary between versions of the same MD simulation engine and depend on the details of the molecular system under consideration. The simulation performance, usually measured in simulated time per day (e.g., ns/day), is plotted against the Oracle Cloud virtual and bare metal shapes.
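The ns/day metric follows directly from a benchmark's wall-clock time. As a sketch (the 60-second wall time below is an illustrative number, not a measured result), the conversion for the 5,000-step, 2.0 fs benchmark runs used here is:

```python
def ns_per_day(n_steps: int, timestep_fs: float, wall_seconds: float) -> float:
    """Convert a benchmark's wall-clock time to simulated ns/day."""
    simulated_ns = n_steps * timestep_fs * 1e-6   # fs -> ns
    return simulated_ns * 86400.0 / wall_seconds  # scale to one day

# Example: 5,000 steps at 2.0 fs = 10 ps of simulated time.
# If that benchmark finished in 60 s of wall time:
print(round(ns_per_day(5000, 2.0, 60.0), 1))  # → 14.4
```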

The benchmark illustrated in Figure 1 shows: 

  1. GPU-accelerated, GPU-resident MD simulation software (i.e., NAMD 3.0a) delivers a 2.5x performance boost compared to CPU-bound versions (e.g., NAMD 2.14)
  2. Performance gains scale at a much higher rate for GPU-enabled, GPU-resident software (NAMD 3.0a) that takes advantage of GPU3.x shapes with the NVIDIA Volta architecture
  3. Higher performance gains can potentially be achieved by clustering these shapes

Figure 1

An offshoot of the benchmark analysis, detailed in Figure 2, is the MD simulation cost, expressed in USD/ns. This shows: 

  1. GPU-accelerated, GPU-resident NAMD 3.0 alpha execution costs are about 2-2.5x lower than those of its CPU-bound counterpart, NAMD 2.14
  2. NAMD 3.0 alpha showed linear cost scaling with an increasing number of GPU cores, independent of virtual versus bare metal shape, indicating minimal or no hypervisor overhead on these computational workloads
  3. Costs scale up at a lower rate as computational load increases for GPU-accelerated, GPU-resident NAMD 3.0 alpha

Figure 2
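The USD/ns metric combines throughput with instance pricing. A minimal sketch of the calculation, using hypothetical placeholder numbers rather than Oracle list prices or measured throughput:

```python
def usd_per_ns(hourly_rate_usd: float, ns_per_day: float) -> float:
    """Cost of simulating 1 ns on an instance with a given hourly rate."""
    usd_per_day = hourly_rate_usd * 24.0
    return usd_per_day / ns_per_day

# Hypothetical example (placeholder figures, not Oracle pricing):
# an instance billed at $3.00/hour sustaining 14.4 ns/day costs:
print(round(usd_per_ns(3.00, 14.4), 2))  # → 5.0
```

Under this arithmetic, a software change that doubles ns/day on the same shape halves the USD/ns cost, which is the effect the benchmarks above attribute to the GPU-resident code path.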

The results of the above benchmarks are in alignment with published NVIDIA and Oracle benchmarks, as shown in Figures 3 and 4 below.

Figure 3

Figure 4

The results outlined in the figures above also align with NAMD 3.0 alpha and NVIDIA GPU benchmarks. The apolipoprotein A-I (ApoA1) and satellite tobacco mosaic virus (STMV) benchmark systems contain 92,224 and 1,066,628 atoms, respectively. Further details on the published standard benchmarks can be found in the NAMD performance and NVIDIA MD blogs.

To learn more about how Oracle is shaping critical research with access to Oracle Cloud, technical mentorship, and a nurturing community, visit Oracle for Research Technology Talks. Ready to explore how Oracle for Research can help you accelerate your research-driven discoveries? Contact us!
