Genome-sequencing software for GPUs running on Oracle Cloud Infrastructure
Oracle Cloud Infrastructure is working with NVIDIA to accelerate researchers’ understanding of the COVID-19 virus. Oracle Cloud Infrastructure has released an updated NVIDIA GPU Cloud Machine Image that enables researchers to use NVIDIA’s Parabricks software suite to perform genome-sequencing of the virus. This is in conjunction with NVIDIA’s announced free 90-day license of Parabricks to any researcher in the fight against COVID-19.
The NVIDIA GPU Cloud Machine Image on Oracle Cloud Infrastructure contains GPU-optimized software from NVIDIA GPU Cloud (NGC) and gives researchers, data scientists, and developers an environment to help them gain insights, build business value, and develop new solutions. This image takes advantage of the cutting-edge cloud computing infrastructure that Oracle has built. You can run the NVIDIA GPU Cloud Machine Image across the full portfolio of GPU options on Oracle Cloud, ranging from virtual machines with one or more P100 or V100 GPU chips to the cutting-edge, bare metal GPU instance with eight V100 chips attached.
Researchers who want to test possible outcomes and responses related to this new and largely unknown virus can use Parabricks software, which was designed to perform next-generation sequencing of DNA data. Parabricks can analyze 30x whole-genome sequencing (WGS) data for a human genome in about 45 minutes, compared to about 30 hours for conventional CPU-based analysis. If you’re a researcher working on COVID-19, you just need to fill out a form to request a Parabricks license. We're excited for our cloud resources to be a small part of our society's response to this pandemic that is affecting all of us, and we hope to see progress in research that benefits our world in the face of this challenge.
Follow these steps to launch a Compute instance with Parabricks software.
Log in to your tenancy in the Oracle Cloud Infrastructure Console.
Select a region where GPUs are available, for example, Germany Central (Frankfurt).
In the main menu, navigate to Networking and then Virtual Cloud Networks.
Click Networking Quickstart, then VCN with Internet Connectivity, and then Start Workflow.
Enter network details for the VCN, and then click Create.
Go to the Oracle Cloud Marketplace page for the image.
Click Get App in the upper-right corner of the page.
Select the Oracle Cloud Infrastructure (OCI) region for the Compute launch (the same region where you just created a VCN) and click Sign In.
Select the latest image version (for example, 20200219-default) and select your compartment. Then, click Launch Instance.
Create a Compute instance with the following details:
After the instance is provisioned, use SSH to access it by running the following command:
ssh ubuntu@<ip address> -i <private key name>
Our NGC image version 20200219 comes with the following libraries installed, which meet the basic Parabricks requirements:
Pascal (sm60), Volta (sm70), or Turing (sm75) NVIDIA GPUs
VM.GPU2.1 (1 Pascal sm60 GPU)
VM.GPU3.1 (1 Volta sm70 GPU)
VM.GPU3.2 (2 Volta sm70 GPUs)
VM.GPU3.4 (4 Volta sm70 GPUs)
BM.GPU2.2 (2 Pascal sm60 GPUs)
BM.GPU3.8 (8 Volta sm70 GPUs)
CUDA driver version >= 384.81
CUDA version 9.2.148
Linux x86_64 driver Version >= 396.37
Python >= 2.7
NVIDIA Docker version 18.09.4, build d14af54
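To confirm the driver requirement on a running instance, a check along the following lines may help. This is a minimal sketch rather than part of the NVIDIA instructions: the helper name version_ge is hypothetical, and the script assumes that nvidia-smi and GNU sort (for the -V option) are available on the image.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A >= version B (hypothetical helper).
# Relies on GNU sort -V for version-aware ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Only query the driver when nvidia-smi is present on the machine.
if command -v nvidia-smi >/dev/null 2>&1; then
  driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  if version_ge "$driver" "396.37"; then
    echo "Driver $driver meets the >= 396.37 requirement"
  else
    echo "Driver $driver is older than 396.37"
  fi
fi
```

The same helper can be reused for the other minimum versions in the list by passing a different threshold as the second argument.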
In the navigation menu, select Block Storage and then Block Volumes.
Create a volume with the following values:
In the navigation menu, select Compute and then Instances.
Select the GPU instance that you just created.
Under Resources, click Attached Block Volumes.
Select iSCSI as the attachment type.
For Access, select Read/Write.
Choose Select Volume, select your compartment, and then select the block volume that you just created.
Select a device path, for example, /dev/oracleoci/oraclevdb.
After the volume is attached, click the Actions menu (three dots) next to the volume, and then click iSCSI Commands and Information. Then, run the attach commands, one by one.
To list the attached devices, run lsblk.
The block volume should appear as attached.
List the partition table of the attached volume to confirm that the device is visible:
sudo fdisk -l <device path>
For example: sudo fdisk -l /dev/oracleoci/oraclevdb
Create an ext4 file system on the volume:
sudo mkfs -t ext4 <device path>
Create a folder on the /mnt drive to mount the block volume. For example:
sudo mkdir /mnt/parabricks
Mount the volume on the folder that you just created:
sudo mount <device path> <mount point>
For example: sudo mount /dev/oracleoci/oraclevdb /mnt/parabricks
The volume should now appear as mounted on the mount point.
Change the permissions of the volume. For example:
sudo chmod 777 /mnt/parabricks
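The format-and-mount steps above can be collected into one script. The following is a sketch under stated assumptions, not an official procedure: DEVICE and MOUNT_POINT default to the example values used above, and run, prepare_volume, and the DRY_RUN switch are hypothetical names introduced here. By default the script only prints the commands, because mkfs erases any data already on the device.

```shell
#!/usr/bin/env bash
# Sketch: format and mount a block volume in one pass.
# DEVICE and MOUNT_POINT default to the example values from the steps above.
# DRY_RUN is a hypothetical safety switch: 1 prints commands, 0 executes them.
DEVICE="${DEVICE:-/dev/oracleoci/oraclevdb}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/parabricks}"
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

prepare_volume() {
  run sudo mkfs -t ext4 "$DEVICE"          # WARNING: destroys existing data
  run sudo mkdir -p "$MOUNT_POINT"         # create the mount point
  run sudo mount "$DEVICE" "$MOUNT_POINT"  # mount the new file system
  run sudo chmod 777 "$MOUNT_POINT"        # open permissions, as in the steps above
}

prepare_volume
```

Review the printed commands first, and set DRY_RUN=0 only after confirming that the device path points at the new, empty volume.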
To configure Parabricks, follow these steps and refer to the NVIDIA instructions.
Request access to the Parabricks installer Python file for a 90-day trial.
Download the Parabricks installer file into your mounted share drive on the instance, and then extract it:
tar -xzf parabricks.tar.gz
Run the installer:
Verify your pbrun version:
sudo pbrun version
Download a sample dataset for a benchmark run, and then extract it:
wget -O parabricks_sample.tar.gz "https://s3.amazonaws.com/parabricks.sample/parabricks_sample.tar.gz?Expires=1613069864&Signature=WxLeyitbvR%2B0rO4MX%2B0GohDw89g%3D&AWSAccessKeyId=AKIAJGDUNN2G2ZAH3Q3A"
tar -xvzf parabricks_sample.tar.gz
Run the following command:
sudo pbrun fq2bam --ref parabricks_sample/Ref/Homo_sapiens_assembly38.fasta --in-fq parabricks_sample/Data/sample_1.fq.gz parabricks_sample/Data/sample_2.fq.gz --out-bam output.bam
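Because an incomplete download or extraction only surfaces once the run starts, it can be worth confirming that the sample files are in place before launching fq2bam. The following is a minimal sketch: check_inputs and SAMPLE_DIR are hypothetical names introduced here, and the file paths are the ones used in the command above.

```shell
#!/usr/bin/env bash
# check_inputs FILE...: hypothetical helper that reports every missing or
# empty file on stderr and fails if any were found.
check_inputs() {
  local f missing=0
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "Missing or empty: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# SAMPLE_DIR is an assumed variable pointing at the extracted sample dataset.
SAMPLE_DIR="${SAMPLE_DIR:-parabricks_sample}"
if check_inputs \
    "$SAMPLE_DIR/Ref/Homo_sapiens_assembly38.fasta" \
    "$SAMPLE_DIR/Data/sample_1.fq.gz" \
    "$SAMPLE_DIR/Data/sample_2.fq.gz"; then
  echo "Sample inputs found; ready to run pbrun fq2bam"
else
  echo "Sample inputs incomplete; repeat the download and extraction" >&2
fi
```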