(Originally published on Forbes)
Penn State chemistry professor Edward O’Brien is doing the kind of work long associated with supercomputing: creating models of how our bodies make proteins in an effort to learn what causes the mutations that make us sick.
High-performance computing, however, is becoming essential to many more fields of study at the university. Witness Zita Oravecz’s work in quantitative psychology.
Oravecz, a professor of human development and family studies, is combining real-time physical data, provided via fitness trackers on study participants, with data they enter into an app about their emotions and stress levels. The goal is to create models that grasp the connections between our bodies, minds, and emotions—perhaps leading to an app that could predict your emotional state and intervene in real time if needed.
This ever-rising demand for data-driven research has led Penn State’s Institute for CyberScience (ICS) to turn to cloud-based infrastructure to handle big spikes in computing demand and provide more overall capacity. ICS’s mission is to support Penn State researchers’ high-performance computing needs across a huge span, from weather patterns on Mars to Twitter streams in Tunisia.
A Mars weather simulation, for instance, “needs to run a simulation on tens of thousands of cores at one fell swoop, but on a pretty short notice,” says Chuck Gilbert, ICS technical director and chief architect of the institute’s Advanced CyberInfrastructure (ICS-ACI). “To have all that computational capacity on site and sitting idle at all times waiting for these types of requests is not a good use of university resources.”
ICS has a “co-hire” program, where Penn State faculty members such as Oravecz and O’Brien are affiliated with and supported by both ICS and their home departments. The goal is to help experts work across disciplines, jump-starting interdisciplinary research that can lead to innovative solutions. What ICS does not want is for those experts to spend a lot of time worrying about how to get the computing power they need to pursue their ideas.
By early next year, ICS intends to have its ICS-ACI systems linked to a handful of major cloud providers, including Oracle Cloud Infrastructure, so that Gilbert’s team can easily move workloads. “We're literally extending our data center footprint out to the cloud,” Gilbert says.
Different Clouds for Different Purposes
Not all public clouds are the same, and Gilbert’s team will direct workloads to the computing source—on premises or the cloud—that makes the most sense, he says.
Oracle Cloud Infrastructure, for example, offers the option to run workloads on so-called “bare metal”—directly on a server so there is no hypervisor software that can slow down certain computing tasks.
A bare metal option might be needed, for example, in a mechanical engineering simulation that relies on direct access to a microprocessor’s built-in cache. ICS might tap bare metal cloud capacity for simulations that need very low-latency communications and that can’t tolerate any chance of hitting an over-subscribed, shared server.
“Really, we're talking about nanoseconds and microseconds, and that little bit of jitter can make all the difference in a simulation being correct or not correct,” Gilbert says.
As it adds cloud resources to ICS-ACI, ICS will also expand the system from 23,000 physical CPUs today to 48,000. The on-premises high-performance system now deploys six to eight petabytes of active storage and another 12 petabytes of archival storage.
Another reason ICS might move a workload to the cloud, Gilbert says, is for better security—to achieve higher-level certifications, such as the federal government’s FedRAMP. Some federal grants require researchers to use FedRAMP-certified data centers.
“We're able to utilize a cloud provider that has an even broader set of resources to bring to bear on security and certifications, and then just push those types of sensitive workloads out to that preapproved provider,” Gilbert says.
A DevOps Strategy for Cloud Bursting
Gilbert’s team has tested public cloud infrastructures against standard high-performance computing benchmarks, and now it’s testing different clouds based on real ICS research use cases. A lot of the work is around standardization of software-building and deployment methodologies, so researchers have the same experience accessing computing power whether it’s in-house or in a public cloud.
“It's really solid DevOps principles that are being applied,” he says.
Cloud infrastructure adoption varies widely across higher education institutions, but more researchers are exploring it to get the nimble computing power they need, says Brent Seaman, vice president of cloud solutions with Mythics, a consultancy that has worked with ICS and other universities on cloud projects. Say a professor heads down a research avenue that requires a lot of computing power to prove or disprove, and she wants to quickly assess the idea’s potential without investing a lot in hardware, in case it doesn’t pan out. “Certain projects might come and go that don’t correspond to the university budget cycle, and researchers need to be able to respond,” Seaman says.
The ultimate goal is to allow researchers to spend almost no time thinking about computing resources and most of their time thinking about their results, models, and theories. At ICS, initially, Gilbert’s team and the researchers will decide whether a job should be placed with a cloud provider, but in the future, that process will be automated, as the ICS-ACI platform picks the right cloud or in-house resource for the job.
“As we build that extension of our data center off to the cloud,” Gilbert says, “there technically isn't a peak capacity any longer.”