Oracle Cloud Infrastructure Supports COVID-19 Researchers with NVIDIA Parabricks

Andrew Butterfield and Kristen Yang

Genome-sequencing software for GPUs running on Oracle Cloud Infrastructure

Oracle Cloud Infrastructure is working with NVIDIA to accelerate researchers’ understanding of the COVID-19 virus. Oracle Cloud Infrastructure has released an updated NVIDIA GPU Cloud Machine Image that enables researchers to use NVIDIA’s Parabricks software suite to perform genome-sequencing of the virus. This is in conjunction with NVIDIA’s announced free 90-day license of Parabricks to any researcher in the fight against COVID-19.

Updated NVIDIA GPU Cloud Machine Image

The NVIDIA GPU Cloud Machine Image on Oracle Cloud Infrastructure contains GPU-optimized software from NVIDIA GPU Cloud (NGC) and gives researchers, data scientists, and developers an environment to help them gain insights, build business value, and develop new solutions. This image takes advantage of the cutting-edge cloud computing infrastructure that Oracle has built. You can run the NVIDIA GPU Cloud Machine Image across the full portfolio of GPU options on Oracle Cloud, ranging from virtual machines with one or more P100 or V100 GPU chips to the cutting-edge, bare metal GPU instance with eight V100 chips attached.

Parabricks

Researchers who want to test possible outcomes and responses related to this new and largely unknown virus can use Parabricks software, which was designed to analyze next-generation sequencing (NGS) DNA data. Parabricks can analyze a whole human genome (30x WGS data) in about 45 minutes, compared to about 30 hours with a conventional CPU-based pipeline. If you’re a researcher working on COVID-19, you just need to fill out a form to request a Parabricks license. We're excited for our cloud resources to be a small part of our society's response to this pandemic that is affecting all of us, and we hope to see progress in research that benefits our world in the face of this challenge.

Getting Started

Follow these steps to launch a Compute instance with Parabricks software.

Step 1: Create a Virtual Cloud Network

  1. Log in to your tenancy in the Oracle Cloud Infrastructure Console.

  2. Select a region where GPUs are available, for example, Germany Central (Frankfurt).

  3. In the main menu, navigate to Networking and then Virtual Cloud Networks.

  4. Click Networking Quickstart, then VCN with Internet Connectivity, and then Start Workflow.

  5. Enter network details for the VCN, and then click Create.

Step 2: Launch an Instance by Using the NVIDIA GPU Cloud Machine Image

  1. Go to the Oracle Cloud Marketplace page for the image.

  2. Click Get App in the upper-right corner of the page.

    Screenshot of the Get App button.

  3. Select the Oracle Cloud Infrastructure (OCI) region for the Compute launch (the same region where you just created a VCN) and click Sign In.

    Screenshot that shows the region selector and the Sign In button.

  4. Select the latest image version (for example, 20200219-default) and select your compartment. Then, click Launch Instance.

    Screenshot that shows the version and compartment selectors and the Launch Instance button.

  5. Create a Compute instance with the following details:

    • Image Source: Select NVIDIA GPU Cloud Machine Image.
    • Availability Domain: Select an availability domain that has GPUs, for example, EU-FRANKFURT-1-AD-3.
    • Instance Type: Choose Virtual Machine or Bare Metal.
    • Instance Shape: Choose one of the following shapes:
      • VM.GPU2.1
      • BM.GPU2.2
      • VM.GPU3.1
      • VM.GPU3.2
      • VM.GPU3.4
      • BM.GPU3.8
    • Configure Networking: Select the correct compartment and the recently created VCN and subnets.
    • Add SSH Keys: Paste in your SSH public key.
  6. After the instance is provisioned, use SSH to access it by running the following command:

    ssh -i <private key path> ubuntu@<public ip address>
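
    After you connect, it can be worth confirming that the instance exposes the number of GPUs that its shape promises. The following is a minimal sketch, not an official Oracle or NVIDIA tool: the function names expected_gpus and check_gpus are made up for illustration, and the GPU counts per shape are taken from the requirements table in Step 3. It assumes that nvidia-smi is on the PATH, as it is on GPU images.

```shell
# expected_gpus SHAPE: print the GPU count for a shape named in this post.
expected_gpus() {
    case "$1" in
        VM.GPU2.1|VM.GPU3.1) echo 1 ;;
        BM.GPU2.2|VM.GPU3.2) echo 2 ;;
        VM.GPU3.4)           echo 4 ;;
        BM.GPU3.8)           echo 8 ;;
        *) echo "unknown shape: $1" >&2; return 1 ;;
    esac
}

# check_gpus SHAPE: compare the expected count with what the driver reports.
# nvidia-smi -L prints one line per visible GPU, each starting with "GPU ".
check_gpus() {
    want=$(expected_gpus "$1") || return 1
    have=$(nvidia-smi -L 2>/dev/null | grep -c '^GPU ' || true)
    if [ "$have" -eq "$want" ]; then
        echo "OK: $have GPU(s) visible on $1"
    else
        echo "MISMATCH: expected $want GPU(s) on $1, found $have" >&2
        return 1
    fi
}

# Usage on the instance (the shape name is an example):
#   check_gpus VM.GPU3.2
```

    A mismatch usually points to a driver problem or to a different shape than the one you intended to launch.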

Step 3: Verify the Required Libraries for Parabricks

Our NGC image version 20200219 comes with the following libraries installed, which meet the basic Parabricks requirements:

| Parabricks Requirement | Offering |
|---|---|
| Pascal (sm60), Volta (sm70), or Turing (sm75) NVIDIA GPUs | VM.GPU2.1 (1 Pascal sm60 GPU) |
| | VM.GPU3.1 (1 Volta sm70 GPU) |
| | VM.GPU3.2 (2 Volta sm70 GPUs) |
| | VM.GPU3.4 (4 Volta sm70 GPUs) |
| | BM.GPU2.2 (2 Pascal sm60 GPUs) |
| | BM.GPU3.8 (8 Volta sm70 GPUs) |
| CUDA driver version >= 384.81 | CUDA version 9.2.148 (Linux x86_64 driver version >= 396.37) |
| Python >= 2.7 | Python 2.7.12 |
| NVIDIA Docker | NVIDIA Docker version 18.09.4, build d14af54 |
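
If you use a different image version, you can check these requirements yourself on the instance. The sketch below is one possible way to do that, not a script shipped with Parabricks: the function names are hypothetical, the minimum versions are the ones listed above, and it assumes GNU sort (for -V version sorting) and that the NVIDIA Docker wrapper is installed as a command named nvidia-docker.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Uses GNU sort -V: the smaller of the two versions sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# report NAME FOUND MINIMUM: print one PASS or FAIL line for a requirement.
report() {
    if [ -n "$2" ] && version_ge "$2" "$3"; then
        echo "PASS $1 $2 (needs >= $3)"
    else
        echo "FAIL $1 ${2:-not found} (needs >= $3)"
        return 1
    fi
}

# check_parabricks_requirements: run all three checks on the current machine.
check_parabricks_requirements() {
    status=0
    driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)
    report "NVIDIA driver" "$driver" 384.81 || status=1

    py_version=""
    if command -v python >/dev/null 2>&1; then
        # Python 2 prints its version to stderr, so merge the streams.
        py_version=$(python --version 2>&1 | awk '{print $2}')
    fi
    report "Python" "$py_version" 2.7 || status=1

    # The requirement names NVIDIA Docker without a version, so only check presence.
    if command -v nvidia-docker >/dev/null 2>&1; then
        echo "PASS nvidia-docker found"
    else
        echo "FAIL nvidia-docker not found"
        status=1
    fi
    return $status
}

# Usage on the instance:
#   check_parabricks_requirements
```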


Step 4: Add Block Storage for Additional Space to Run Parabricks

  1. In the navigation menu, select Block Storage and then Block Volumes.

  2. Create a volume with the following values:

    • Create in Compartment: Select your compartment.
    • Availability Domain: Select the same availability domain as your instance.
    • Size: Select your block volume size, for example, 1000 GB.
    • Backup Policy: Select a backup policy, for example, Bronze.
    • Encryption: Encrypt using Oracle managed keys.
  3. In the navigation menu, select Compute and then Instances.

  4. Select the GPU instance that you just created.

  5. Under Resources, click Attached Block Volumes.

  6. Select iSCSI as the attachment type.

  7. For Access, select Read/Write.

  8. Choose Select Volume, select your compartment, and then select the block volume that you just created.

  9. Select a device path, for example, /dev/oracleoci/oraclevdb.

  10. Click Attach.

  11. After the volume is attached, click the Actions menu (three dots) next to the volume, and then click iSCSI Commands and Information. Then, on the instance, run the attach commands that are listed there, one by one.

  12. To list the attached devices, run lsblk.

    The block volume should appear as attached.

  13. Inspect the partition table of the volume to confirm that the device is visible:

    sudo fdisk -l <device path>

    For example: sudo fdisk -l /dev/oracleoci/oraclevdb

  14. Create an ext4 file system on the volume:

    sudo mkfs -t ext4 <device path>
  15. Create a directory under /mnt to use as the mount point for the block volume. For example:

    sudo mkdir /mnt/parabricks
  16. Mount the volume on the mount point:

    sudo mount <device path> <mount point>

    For example: sudo mount /dev/oracleoci/oraclevdb /mnt/parabricks

  17. Run lsblk.

    The disk should now appear as mounted on the mount point.

  18. Change the permissions of the mount point so that all users can read and write to it. For example:

    sudo chmod 777 /mnt/parabricks
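
The format and mount steps above can also be scripted so that they are safe to rerun. The following is a minimal sketch under stated assumptions, not part of the original procedure: the function names are hypothetical, it assumes the volume is already attached over iSCSI, and it adds an /etc/fstab entry (which the steps above do not cover) so that the mount survives a reboot. The _netdev and nofail options are the usual choice for iSCSI volumes because the device is only available after the network is up.

```shell
# fstab_entry DEVICE MOUNT_POINT: print an /etc/fstab line for an iSCSI-attached volume.
# _netdev delays the mount until networking is up; nofail lets the instance boot
# even when the volume is unavailable.
fstab_entry() {
    printf '%s %s ext4 defaults,_netdev,nofail 0 2\n' "$1" "$2"
}

# prepare_volume DEVICE MOUNT_POINT: format the device only if it has no file
# system yet, mount it, and record it in /etc/fstab. Requires sudo on the instance.
prepare_volume() {
    device=$1
    mount_point=$2

    # blkid exits non-zero when it finds no file system signature on the device.
    if ! sudo blkid "$device" >/dev/null 2>&1; then
        sudo mkfs -t ext4 "$device" || return 1
    fi

    sudo mkdir -p "$mount_point" || return 1
    if ! mountpoint -q "$mount_point"; then
        sudo mount "$device" "$mount_point" || return 1
    fi

    # Append the fstab entry only if the exact line is not already present.
    entry=$(fstab_entry "$device" "$mount_point")
    grep -qxF "$entry" /etc/fstab || echo "$entry" | sudo tee -a /etc/fstab >/dev/null
}

# Usage on the instance (paths are the examples used in this post):
#   prepare_volume /dev/oracleoci/oraclevdb /mnt/parabricks
```

Checking for an existing file system before running mkfs avoids wiping a volume that already holds data if the script is run a second time.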

Step 5: Configure Parabricks

To configure Parabricks, follow these steps and refer to the NVIDIA instructions.

  1. Request access to the Parabricks installer Python file for a 90-day trial.

  2. Download the Parabricks installer file into your mounted share drive on the instance.

  3. Untar parabricks.tar.gz:

    tar -xzf parabricks.tar.gz
  4. Run the installer:

    sudo ./parabricks/installer.py
  5. Verify your pbrun version:

    sudo pbrun version
  6. Download a sample dataset to run a benchmark and untar it:

    wget -O parabricks_sample.tar.gz 'https://s3.amazonaws.com/parabricks.sample/parabricks_sample.tar.gz?Expires=1613069864&Signature=WxLeyitbvR%2B0rO4MX%2B0GohDw89g%3D&AWSAccessKeyId=AKIAJGDUNN2G2ZAH3Q3A'
    tar -xvzf parabricks_sample.tar.gz
  7. Run the fq2bam pipeline on the sample data as a benchmark:

    sudo pbrun fq2bam --ref parabricks_sample/Ref/Homo_sapiens_assembly38.fasta --in-fq parabricks_sample/Data/sample_1.fq.gz parabricks_sample/Data/sample_2.fq.gz --out-bam output.bam
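
To turn the last step into a repeatable benchmark, you can wrap it in a small script that checks the inputs first and reports the elapsed time. This is an illustrative sketch only: the function names missing_inputs and run_benchmark are hypothetical, and the file paths and pbrun options are the ones used in the command above.

```shell
# missing_inputs DIR: print each expected sample file that is absent under DIR.
missing_inputs() {
    for f in Ref/Homo_sapiens_assembly38.fasta Data/sample_1.fq.gz Data/sample_2.fq.gz; do
        [ -f "$1/$f" ] || echo "$1/$f"
    done
}

# run_benchmark DIR OUT_BAM: verify the inputs, then time the fq2bam run.
run_benchmark() {
    sample_dir=$1
    out_bam=$2

    missing=$(missing_inputs "$sample_dir")
    if [ -n "$missing" ]; then
        echo "Missing input files:" >&2
        echo "$missing" >&2
        return 1
    fi

    start=$(date +%s)
    sudo pbrun fq2bam \
        --ref "$sample_dir/Ref/Homo_sapiens_assembly38.fasta" \
        --in-fq "$sample_dir/Data/sample_1.fq.gz" "$sample_dir/Data/sample_2.fq.gz" \
        --out-bam "$out_bam" || return 1
    end=$(date +%s)
    echo "fq2bam finished in $((end - start)) seconds"
}

# Usage from the directory where you extracted the sample dataset:
#   run_benchmark parabricks_sample output.bam
```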
