Tom Haunert, Oracle Magazine editor in chief, recently sat down with Scott Tracy, senior director of storage software development at Oracle, for an in-depth discussion about Oracle’s flash, disk, tiered storage, and unified storage technologies.
The following is an excerpt from that interview. Download the full podcast at oracle.com/magcasts.
Oracle Magazine: Beyond the basic demands of storage, what are the main challenges organizations are facing today with the information explosion and the storage required to manage it?
Tracy: There are a number of things IT shops are facing. Certainly, data management and migration activities: as platforms get older, they get slower, or they run out of service life. The act of replacing that storage takes a substantial amount of time and activity. Then there are the cost constraints people are under today: the number of personnel required to actually manage the IT shops, and obviously the cost of the storage devices themselves, as well as servers for those devices.
Problem-resolution time is a big deal too. As you get more and more storage in your datacenter—more and more machines—problems become more complicated. They take longer to figure out, and IT shops are looking for ways to get around that. Then there’s data protection and recovery—just making sure that you’ve got adequate copies and backups of data, and that those copies and backups can be restored in case of disaster, in a recovery time that makes sense for your business.
Finally, there’s security. Every day or every week, it seems like some business has been breached. And so, whether it’s security in the form of encryption on devices so that nobody can peek in to see the data, or it’s just physical security where people just can’t get to the backbone of your datacenter, everybody has some implementation of security. We need to keep that in mind as we’re designing new storage.
Oracle Magazine: What are some of the key strategies in dealing with these storage challenges?
Tracy: Certainly consolidation and reduction. Taking data from multiple older storage devices and putting it onto a single newer storage device is a hardware solution. From the software perspective, there are newer technologies like compression, where you can squeeze out a 2-to-1 reduction of data, and deduplication, where you can get up to a 10-to-1 reduction of data.
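To make the two software techniques concrete, here is a minimal Python sketch that deduplicates fixed-size blocks by content hash and then compresses each unique block. The function name, the 4 KB block size, and the sample data are assumptions chosen for illustration; they do not describe how any Oracle product implements these features, and real reduction ratios depend entirely on the data.

```python
import hashlib
import zlib


def dedup_and_compress(data: bytes, block_size: int = 4096):
    """Return (logical_bytes, stored_bytes, unique_blocks).

    Hypothetical helper: splits data into fixed-size blocks, keeps one
    compressed copy of each distinct block, and reports the sizes.
    """
    store = {}  # SHA-256 digest -> compressed block contents
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).digest()
        if digest not in store:
            # Only the first copy of a block is compressed and stored.
            store[digest] = zlib.compress(block)
    stored_bytes = sum(len(chunk) for chunk in store.values())
    return len(data), stored_bytes, len(store)


# Sample data: two distinct 4 KB blocks repeated 50 times each.
block_a = bytes(range(256)) * 16
block_b = bytes(reversed(range(256))) * 16
data = (block_a + block_b) * 50

logical, stored, unique_blocks = dedup_and_compress(data)
print(f"logical={logical} stored={stored} unique blocks={unique_blocks}")
```

Highly repetitive data like this sample reduces far more than typical production data would; the sketch only shows where the savings come from.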
Another interesting technology is flash. It represents a new tier of storage, and my thought as an engineer had been that everybody managing their datacenters would say, “Hooray, this is great—another tier that’s faster!” In fact, people in the IT shops are saying, “Yes, great. It’s a faster piece of storage, but now I have to go manage yet another tier.” So the idea of autotiering between flash, disk, and tape is something that IT managers are looking for. If you’re trying to take data and manually place it in either flash or disk, that’s very hard to do, and you’ll ultimately get it wrong. You need software to help manage the transition.
The reason this is important is, of course, that flash offers higher performance. It’s also a little bit higher in cost. Disk is somewhere in the middle, both in terms of cost and performance. There are shades of gray here: high-performance disks perform better and cost more, higher-capacity disks cost a little less but are a little slower, and tape is the lowest in terms of cost but also the lowest in terms of performance. To manage your datacenter effectively across the cost spectrum, you really have to use all these tiers.
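The placement decision that autotiering software automates can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function name, the dataset names, the access counts, and the tier capacities are all hypothetical, and a real product would weigh many more signals than a single access count.

```python
def place_in_tiers(access_counts, sizes_gb, tiers):
    """Assign each dataset to the fastest tier that still has room.

    access_counts: dataset name -> recent access count (hotter = larger)
    sizes_gb:      dataset name -> size in GB
    tiers:         list of (tier name, capacity in GB), fastest first
    """
    remaining = [[name, capacity] for name, capacity in tiers]
    placement = {}
    # Visit the hottest data first so it gets first pick of the fast tier.
    for dataset in sorted(access_counts, key=access_counts.get, reverse=True):
        for slot in remaining:
            if sizes_gb[dataset] <= slot[1]:
                slot[1] -= sizes_gb[dataset]
                placement[dataset] = slot[0]
                break
        else:
            raise ValueError(f"no tier has room for {dataset}")
    return placement


# Hypothetical workload; tape is modeled as unbounded capacity.
access_counts = {"orders": 900, "logs": 40, "archive": 1, "index": 500}
sizes_gb = {"orders": 60, "logs": 200, "archive": 800, "index": 30}
tiers = [("flash", 100), ("disk", 300), ("tape", float("inf"))]

placement = place_in_tiers(access_counts, sizes_gb, tiers)
print(placement)
```

With these sample numbers, the two hottest datasets land in flash, the logs go to disk, and the rarely touched archive goes to tape.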
Oracle Magazine: What do organizations need to consider when including flash storage in their tiered storage strategies?
Tracy: The introduction of flash as a separate tier of storage is really there to speed up response time for both reads and writes. The strategy of flash on the read side is to make sure the data needed is already cached. There are a number of algorithms inside many Oracle storage boxes, such as the Sun ZFS Storage Appliance, that actually try to understand data patterns and prefetch information from disk and store it in flash. So when you ask for it, it’s already cached, and you get a very quick read back to your application.
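The read-side idea can be illustrated with a small Python sketch: a cache that notices sequential reads and pulls the next few blocks in ahead of the request. The class name, the least-recently-used eviction, and the prefetch depth are assumptions made for this example; the interview does not describe the actual algorithms used in the appliance.

```python
from collections import OrderedDict


class PrefetchingReadCache:
    """Toy read cache with sequential prefetch (illustrative only)."""

    def __init__(self, backing, capacity=8, prefetch_depth=2):
        self.backing = backing          # dict: block number -> data (stands in for disk)
        self.capacity = capacity        # maximum number of cached blocks
        self.prefetch_depth = prefetch_depth
        self.cache = OrderedDict()      # stands in for flash, kept in LRU order
        self.last_block = None
        self.hits = 0
        self.misses = 0

    def _store(self, block, data):
        self.cache[block] = data
        self.cache.move_to_end(block)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block

    def read(self, block):
        if block in self.cache:
            self.hits += 1
            self.cache.move_to_end(block)
            data = self.cache[block]
        else:
            self.misses += 1
            data = self.backing[block]      # slow path: go to disk
            self._store(block, data)
        # Two consecutive block numbers are treated as a sequential pattern,
        # so the next blocks are fetched before anyone asks for them.
        if self.last_block is not None and block == self.last_block + 1:
            for nxt in range(block + 1, block + 1 + self.prefetch_depth):
                if nxt not in self.cache and nxt in self.backing:
                    self._store(nxt, self.backing[nxt])
        self.last_block = block
        return data


disk = {n: f"block-{n}" for n in range(10)}
cache = PrefetchingReadCache(disk)
results = [cache.read(n) for n in range(10)]
print(cache.hits, cache.misses)
```

In this sequential scan only the first two reads go to disk; once the pattern is detected, every later block is already cached when it is requested.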
The strategy of flash on writes is to temporarily store the data in flash, return a completion status to the application, and then write that data permanently to disk. So you get the quick response time of, “Hey, I’ve written this data.” It looks like you’ve written it to disk, but a background process actually does the writing.
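A minimal Python sketch of that write path follows. The class and method names are hypothetical, plain dictionaries stand in for flash and disk, and the flush is called explicitly here where a real system would run it as a background process.

```python
class WriteBackCache:
    """Toy write-back cache: acknowledge from flash, persist to disk later."""

    def __init__(self, disk):
        self.disk = disk        # dict standing in for disk
        self.pending = {}       # dict standing in for write flash: block -> newest data

    def write(self, block, data):
        # The write is acknowledged as soon as it is held in the fast tier.
        self.pending[block] = data
        return "ok"

    def read(self, block):
        # Reads must see data that has been acknowledged but not yet flushed.
        if block in self.pending:
            return self.pending[block]
        return self.disk[block]

    def flush(self):
        # In a real system this would run in the background.
        flushed = len(self.pending)
        self.disk.update(self.pending)
        self.pending.clear()
        return flushed


disk = {}
wb_cache = WriteBackCache(disk)
status = wb_cache.write(7, b"new data")
on_disk_before_flush = 7 in disk
flushed_count = wb_cache.flush()
print(status, on_disk_before_flush, flushed_count, disk[7])
```

The application sees the acknowledgment before the block reaches disk, which is where the response-time gain comes from. It is also why the fast tier has to be non-volatile in practice.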
The thing about flash (I mean it even sounds fast, flash) is that it’s an order of magnitude faster than disk in response time. Newer innovations that use a faster storage type, called DRAM [dynamic random access memory], actually reduce response time by a factor of 10 again. These strategies are starting to come into play today. They also lower the total cost of ownership. From an electricity standpoint, they use significantly less power. The footprint is much smaller, and they have the storage power of many, many disks.
The Sun Storage F5100 Flash Array, for example, is a 2U form-factor array that supports 2 terabytes of flash and performs the same as 3,000 disks, but uses 0.15 percent of the space and about 0.55 percent of the power, a significant reduction.
Oracle Magazine: Tell us about the hardware and software that make up the Sun ZFS Storage Appliance and unified storage.
Tracy: The Sun ZFS Storage Appliance is made up of a bunch of disk drives. Obviously we have both single and redundant controllers. The controllers have DRAM. They also have a read cache in them. Then, out in the storage fabric, we have what we call write flash.
One of the technologies we have in the Sun ZFS Storage Appliance is autotiering, which I’ve mentioned. We actually have our own term for it called hybrid storage pools. And really, again, the data is intelligently and automatically migrated between memory, flash, and disk. Hybrid storage pools are continuously optimizing the storage system for performance and efficiency, and they manage the system as a single storage pool through a simple, transparent single hierarchy.
That describes what’s in the storage appliance, but that’s not unified storage. Unified storage is defined by the software that enables customers to use it, and it falls into three different buckets. One of those buckets is data protocols. If you look at our storage box, we support both file and block protocols, 10 of them in one box. In the next bucket, we also support data services that customers will require, both for data protection and for compression of data, and so on. And in the last bucket, from a management perspective, we have unique analytics, which include dynamic real-time visualization of application- and storage-related workloads with very simple but sophisticated instrumentation that provides real-time, comprehensive analysis. This allows you to troubleshoot and know what’s going on inside your storage box, as well as outside on a server fabric.
Oracle Magazine: Looking forward, what do you see as the next set of challenges and strategies for storage?
Tracy: We started out talking about the explosion of data and the requirements on storage. To keep up with that explosion, storage boxes are going to have to get bigger, faster, and easier to manage, and they’re going to have to start cooperating together.
One thing everybody in the industry is attacking is the idea of data protection. As the data and hard-disk drives get bigger, when a failure occurs, the ability to replicate and give that data back becomes harder.
For example, you can use a technology called mirroring, where you write the data to two disks. That’s actually a pretty good technology solution as disk drives get bigger and bigger. You simply remove the disk that’s failed, put in a new one, and it will copy the data from one side to the other. While you’re exposed to a single point of failure for a period of time, you still have access to the data. The only issue with that solution is that you have to buy twice the number of disk drives, and that becomes cost-prohibitive.
RAID-5 and RAID-6 don’t require two times the disk drives. The big issue there is rebuilding those disks as the disk drives get bigger and bigger. There are strategies and technologies for how to do that rebuilding process faster, and we’re developing those today.
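The trade-off between mirroring and parity RAID can be sketched in Python. The function names are hypothetical, the rebuild-time estimate assumes a constant rebuild rate (100 MB/s is an assumed figure, not a measurement), and the XOR example shows only the single-parity idea behind RAID-5, not a full implementation.

```python
from functools import reduce


def usable_fraction(scheme, drives):
    """Fraction of raw capacity left for data under each protection scheme."""
    redundancy = {"mirror": drives // 2, "raid5": 1, "raid6": 2}[scheme]
    return (drives - redundancy) / drives


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (single-parity calculation)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_hours(capacity_tb, rate_mb_per_s):
    """Rough rebuild time, assuming the whole drive is rewritten at a constant rate."""
    return capacity_tb * 1e12 / (rate_mb_per_s * 1e6) / 3600


# Capacity cost: mirroring gives up half the drives, parity RAID far fewer.
for scheme in ("mirror", "raid5", "raid6"):
    print(scheme, usable_fraction(scheme, 8))

# Single-parity rebuild: XOR the surviving blocks with parity to recover the lost one.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)
rebuilt = xor_blocks([data_blocks[0], data_blocks[2], parity])
print(rebuilt)

# Bigger drives take proportionally longer to rebuild at the same rate.
print(round(rebuild_hours(1, 100), 1), round(rebuild_hours(4, 100), 1))
```

A parity rebuild has to read every surviving drive to reconstruct the failed one, so rebuild time grows with drive size. That growth is the problem the faster rebuild strategies mentioned above are meant to address.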
Finally, there’s manageability. As you have more and more data, you’ve got to have better management because you’re not getting increases in budget to increase your staff. In fact, you’re getting staff reductions. So we as storage vendors have to figure out better and better ways to manage those storage boxes.
Photography by rawpixel.com, Unsplash