It has been very quiet here for the last couple of months. The main reason is my workload, which has increased quite noticeably. However, writing a blog post every now and then is a good habit that I intend to continue. In one of my last posts I touched on the idea of VDI image layering and some of its pros and cons. One aspect of it was application streaming for VDI: the idea is to use a core image/template that is applicable to the vast majority of users. On top of this core image there is a mechanism to entitle users to applications. These applications are then streamed on demand, based on individual user needs.
This idea resonates with many IT operations teams, as it promises to limit the management effort for the core image(s) on the one hand, while on the other hand giving huge flexibility to provision users with exactly the applications they need, and not everything that anybody in the corporation might eventually need.
In reality there are a number of downsides to this approach. First of all, application streaming does not work with each and every application. It works fine with an office application, but no vendor can guarantee that every homegrown app will work with this approach. Another important aspect is the impact on your storage backend when using application streaming:
As many admins have realized in the meantime, the storage system is the component of your VDI solution that needs the most attention in order to get the sizing right and to guarantee a satisfactory user experience, simply because the vast majority of VDI deployments are I/O bound, more or less by definition. If you add application streaming to the mix, you gain more flexibility at the price of additional, expensive storage I/O.
What happens with application streaming in VDI is that, once a user opens an on-demand streamed application, the application is downloaded to the client and prepared for execution there. The client is the virtual desktop, which typically runs on shared storage. In other words, the streamed application moves from shared storage, the application repository, into the virtual disk of the virtual desktop, again on shared storage. This causes significant network and storage traffic, mostly hitting the storage box that serves the virtual disks.
The impact of application streaming on the storage side is manifold. First, every streamed application instance consumes additional storage capacity, at least if you are using technologies such as thin provisioning or linked cloning. Besides the raw capacity needed, processing the streaming load takes CPU cycles on your storage box, and writing each and every streamed application instance takes time on your spindles. On the network side you also have to factor in the additional storage traffic for streaming.
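To get a feeling for the order of magnitude, here is a rough back-of-the-envelope sketch. The function name and all numbers (500 desktops, 3 streamed apps each, 200 MB per app) are purely hypothetical assumptions for illustration, not measurements from any real deployment.

```python
def streaming_write_load_mb(desktops: int, apps_per_desktop: int, app_size_mb: float) -> float:
    """Data written to the desktop datastore if every desktop streams its apps once.

    Assumes no deduplication: each streamed instance lands as a full copy
    in the virtual disk of its desktop.
    """
    return desktops * apps_per_desktop * app_size_mb


# Hypothetical example values, chosen only to illustrate the arithmetic.
total_mb = streaming_write_load_mb(desktops=500, apps_per_desktop=3, app_size_mb=200)
print(f"{total_mb / 1024:.1f} GB written to the desktop datastore")
```

The same amount of data also crosses the network and has to be read from the application repository, so the load shows up in several places at once.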
Of course, if your storage system supports offline deduplication, the initial storage capacity requirement will shrink eventually, once the offline dedup jobs start eliminating duplicate blocks. But this happens way too late, and capacity is typically not your problem in a VDI deployment anyway. The limited resources are rather the CPU, the cache, and the number of spindles of your storage box, all of which are impacted directly by each and every streamed application.
The only thing that can give some relief in this situation is online deduplication. Currently there are no storage systems out there that provide this capability, but there is one filesystem that includes this feature: ZFS. Online deduplication starts before any data is written to the spindles. It maintains an index with references to all written blocks. At the moment a block is about to be written to disk, a lookup in the index is done. If the block already exists, only a reference counter is increased and no data is actually written to disk. You get the idea. There is also a great article from Jeff Bonwick on this subject for more details.
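The lookup-before-write idea can be sketched in a few lines. This is a toy model and not how ZFS is actually implemented: the class `DedupStore`, its attributes, and the use of SHA-256 as the block fingerprint are assumptions made for illustration only.

```python
import hashlib


class DedupStore:
    """Toy model of online (inline) block deduplication."""

    def __init__(self):
        self.index = {}            # block fingerprint -> reference count
        self.disk = {}             # block fingerprint -> data (stands in for the spindles)
        self.physical_writes = 0   # number of blocks that really hit the disk

    def write_block(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key in self.index:
            # Block is already stored: only bump the reference counter.
            self.index[key] += 1
        else:
            # First time this block is seen: record it and write it out.
            self.index[key] = 1
            self.disk[key] = data
            self.physical_writes += 1
        return key


store = DedupStore()
# Three desktops stream the same (hypothetical) two-block application.
app_blocks = [b"A" * 4096, b"B" * 4096]
for _desktop in range(3):
    for block in app_blocks:
        store.write_block(block)

print(store.physical_writes)  # 2 physical writes for 6 logical writes
```

Only the first desktop causes real writes here; the other two just increase reference counters, which is exactly the effect you want when many desktops stream the same application.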
So with online dedup you will not need additional capacity or more write IOPS, as deduplication is handled before these things matter. However, you will still encounter higher network load and some impact on the storage CPU. Once this functionality is available on SAN or NAS systems, you can expect a very positive impact on scenarios that tend to write the same data to disk many times, such as application streaming.
That's it for today,