Just a few days ago I was lucky enough to be a delegate at the 3rd Storage Field Day event, organized by Gestalt IT.
We had presentations from various well-established companies as well as from a few startups. Most presentations were excellent, and we were able to dive deep into the details of various products.
The presentations that made the biggest impression on me were those by PernixData and SanDisk's FlashSoft. To be honest, at first I didn't have high expectations of SanDisk's FlashSoft, since I was not aware SanDisk had enterprise products. I still thought of them as a manufacturer of consumer products. It's a good thing this presentation taught me otherwise.
For a while now there has been a trend of putting SSD drives or FLASH cards in servers, which should lead to extremely high IO with minimal latency. In most cases this works excellently, provided the configuration or application is tuned for such usage. Yet although the performance can be exceptionally high, the approach hasn't gotten much traction. In enterprise environments it is quite normal to have the centralized storage replicated to another location for availability or recoverability purposes. FLASH or SSD storage that is local to the server couldn't be replicated unless the server or application took care of that itself. The downside is that there is no coherency or consistency between the local SSD or FLASH storage and the back-end storage array, rendering the replication useless.
Another problem is that local SSD or FLASH storage cannot be shared with another server or application in a clustered setup, so a failover configuration was not possible.
SanDisk FlashSoft and PernixData take a different approach here. They use the SSD or FLASH storage as a cache device. The server holding the SSD or FLASH device gets an enormous performance improvement, while all data is still eventually flushed to the back-end arrays, so the data in the array stays coherent. Using the SSD or FLASH device as a cache does pose a problem if data sits in the cache for a long time without being destaged to the back-end: the back-end data will be out of sync, an outage can result in data corruption, and replication becomes pointless.
That's why there are a few modes of write caching (a short code sketch after the list illustrates the difference).
- Write-back cache is where data is stored in cache and a write-complete acknowledgment is returned to the application without the data actually being written to the back-end storage media. Data is at risk if an outage occurs before the data is actually written to back-end media. A huge benefit, however, is that multiple writes can be combined, consolidated, and ordered, saving valuable IOs to the back-end. Read IOs can also be served from the cache, again saving IOs to the back-end (depending on the read cache hit rate, of course).
- Write-through cache is where the write-complete acknowledgement is not given to the application until the write IO is safely written to the back-end storage device. There is no real performance benefit on the first write, but subsequent reads might be served from cache if the written blocks are also stored on the local SSD or FLASH cache devices. This method is safer than write-back cache if you want to replicate the back-end storage or share the back-end storage with other hosts, as you would in a cluster.
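To make the difference concrete, here is a minimal sketch of the two modes in Python, assuming a toy cache class in front of a slow back-end store. The class and all names are my own invention, purely for illustration; real products obviously do this in kernel code.

```
class Cache:
    def __init__(self, backend, write_back=True):
        self.backend = backend      # slow back-end storage (the array)
        self.cache = {}             # fast local SSD / FLASH tier
        self.dirty = set()          # blocks not yet destaged (write-back only)
        self.write_back = write_back

    def write(self, block, data):
        self.cache[block] = data
        if self.write_back:
            # acknowledge immediately, destage later: fast, but the
            # block is at risk until destage() has run
            self.dirty.add(block)
        else:
            # write-through: only acknowledge once the back-end has the data
            self.backend[block] = data

    def read(self, block):
        # in both modes, reads are served from cache when the block is present
        if block in self.cache:
            return self.cache[block]
        return self.backend.get(block)

    def destage(self):
        # flush dirty blocks; repeated writes to the same block collapse
        # into a single back-end IO (write coalescing)
        for block in self.dirty:
            self.backend[block] = self.cache[block]
        self.dirty.clear()
```

Note how in write-back mode the only thing standing between an acknowledged write and the array is destage(); that is exactly the coherency risk described above.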
Now, back to the presenters and their products.
PernixData and FlashSoft are software companies. They support a number of SSD or FLASH devices in the server that can be used as a caching device. Both products are very similar, but differ on some points. As time passes these differences will shrink, as both products evolve, mature, and gain more features.
PernixData was co-founded in February 2012 by Satyam Vaghani, the brains behind at least VMware VMFS and VAAI, and Poojan Kumar, who also worked for VMware and, before that, Oracle. Satyam knows exactly how VMware works and has very deep knowledge of VMware and all storage principles related to it. But don't make the mistake of underestimating SanDisk's role in this arena. SanDisk acquired the California-based start-up FlashSoft in February 2012 and added it to complete its stack of enterprise products, which now ranges from various hardware platforms to software. FlashSoft is a shipping product, whereas PernixData is still in beta.
PernixData is installed on all ESX hosts and its management and reporting utility is installed as a vCenter plugin. The same applies to FlashSoft, but a separate management and policy utility is installed on a server outside vCenter, although this server might as well be the vCenter server itself.
Both PernixData and FlashSoft are kernel-mode add-ons that are loaded into the VMware hypervisor kernel. Installing these modules is a non-disruptive operation, and after installation acceleration can be enabled immediately, provided supported SSD or FLASH devices are already present in the ESX hosts.
PernixData needs to be installed on all servers in the ESX cluster, but not all ESX hosts need to have SSD or FLASH devices installed. In vCenter you can then create a flash cluster resource in which all hosts with SSD or FLASH devices are grouped. When accelerating volumes (ESX LUNs), VMDKs, or VMFS datastores you can select a protection level: 0 for no cache replication, 1 for a single remote copy on an SSD or FLASH device on another server in the cluster if available, or 2 for two copies on two different servers in the cluster (or on another server with copies on different flash devices) if available. SanDisk FlashSoft isn't at this level of high availability yet, but is getting there in the next release. The way PernixData and FlashSoft solve the failover details of cache coherency differs; unfortunately the FlashSoft embargo prevents me from revealing those details.
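As an illustration only, here is a sketch of how such a protection level could translate into replica placement across a flash cluster. This is my own guess at the concept, not PernixData's actual algorithm; the function, the pick logic, and the host names are all invented.

```
def place_replicas(local_host, flash_hosts, protection_level):
    """Return the hosts that hold copies of a cached write."""
    if protection_level >= len(flash_hosts):
        raise ValueError("not enough flash hosts for this protection level")
    placements = [local_host]                           # primary copy is local
    peers = [h for h in flash_hosts if h != local_host]
    placements += peers[:protection_level]              # 0, 1 or 2 remote copies
    return placements

# with protection level 1, a write cached on esx01 also lands on one peer:
print(place_replicas("esx01", ["esx01", "esx02", "esx03"], 1))
# ['esx01', 'esx02']
```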
As for SSD or FLASH devices, PernixData and FlashSoft differ slightly too. PernixData supports all SSD and FLASH devices that are on the VMware Hardware Compatibility List, while FlashSoft currently supports most well-known devices. For a detailed list of FlashSoft-supported devices I would refer you to their site, but the information doesn't seem to be publicly available.
The clustered cache accelerator is a true clustered feature, with HA failover support, vMotion and all. If a VM is vMotioned to another ESX host in the cluster that has an SSD or FLASH acceleration device, the cached data will also be migrated to the other host over the ESX network interfaces. This might cause a slight delay or a slight increase in latency, but allegedly this is not noticeable by the VM. A failover or vMotion to or from a server without a cache acceleration device is still possible and supported. As I said, FlashSoft isn't at this level yet, but is getting there on short notice.
For a single non-clustered system with acceleration, FlashSoft will work in write-back mode, but in a clustered configuration FlashSoft needs to run in write-through mode. But again, the future holds new cool features.
Currently, PernixData and FlashSoft only work with VMware vSphere 5.0 and up, but FlashSoft also works with certain Windows and Linux bare-metal servers. Neither vendor supports Microsoft Hyper-V yet, but the plan is that eventually Hyper-V will also be supported. It's quite obvious that VMware is by far the biggest market.
Both PernixData and FlashSoft have made sure that not a single change has to be made to the VMs or applications for the acceleration to work. The acceleration is completely transparent to the host (bare metal or VM), and no configuration change or agent software is needed.
At this time, acceleration is only possible on block IO devices. File-level acceleration might be a thing of the future, where VMs or databases on NFS might see massive improvements in performance. This is a comparatively small market though, so I think it won't get much priority.
Why use Flash Cache acceleration?
The technology behind SSD or FLASH acceleration enables IO consolidation on the host, lowering the number of IOs fired at the back-end array. This lowers the overall load on the storage arrays by a significant percentage. You can then postpone investing in a newer or faster storage array, or increase the total IO load on a host or ESX cluster, achieving higher utilization of your existing arrays and servers without upgrading them.
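A trivial example of that consolidation, with a made-up write workload, shows how repeated writes to the same block coalesce into a single destage:

```
from collections import Counter

workload = ["blk7", "blk7", "blk3", "blk7", "blk3", "blk9"]   # 6 write IOs

without_cache = len(workload)          # every write hits the array: 6 IOs
with_cache = len(Counter(workload))    # one destage per unique block: 3 IOs

print(f"back-end write IOs: {without_cache} -> {with_cache}")
```

Real workloads won't halve their IOs like this toy one does, but hot blocks that are rewritten constantly are exactly where the savings come from.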
You might want to look at this from a cost perspective though. The acceleration software isn't free. Without having the exact figures, I would guess the MSRP is about $3,000 to $5,000 per host, and then you need the SSD or FLASH devices, which go for anything from $2,000 to as high as you like per device, depending on your capacity needs.
You will eventually need to compare the cost of SSD or FLASH acceleration to the cost of upgrading your storage array.
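For example, a back-of-the-envelope comparison using my guessed figures above (every number here is an assumption, not a quote) might look like this:

```
hosts = 8
software_per_host = 4000    # guessed MSRP, somewhere in the $3,000-$5,000 range
flash_per_host = 2500       # an entry-level SSD or FLASH device

acceleration_cost = hosts * (software_per_host + flash_per_host)   # $52,000
array_upgrade_cost = 120000  # hypothetical quote for a faster array

print(f"acceleration: ${acceleration_cost:,} vs array upgrade: ${array_upgrade_cost:,}")
```

If the acceleration actually lets you skip the array upgrade, the math is easy; if it only defers the upgrade, you'd weigh the cost against the length of the deferral instead.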
Both companies are very similar, although some slight differences exist. Those differences are temporary however, since both companies are working hard on improvements and new features. You would think the creator of the VMware VMFS stack and VAAI has the greatest advantage, and that PernixData has the brightest future in this field. But don't forget SanDisk is a Fortune 500 company with a huge cash flow and a team of developers. They too see the value of this market and are working eagerly to secure their part of it. And don't forget: all major storage vendors will have SSD or FLASH acceleration software in the end, so you might as well check your preferred vendor for their development in this area. Just be sure to give them a thorough investigation, and use all the information you can find to make sure your vendor's product doesn't suck.
If you are looking into SSD or FLASH acceleration, make sure you read all you can about it. You could start with these blogs and videos.
Disclaimer: I was invited to SFD3 and all travel and accommodations were paid for. I was not compensated for the time spent at SFD3, nor was I obligated to write about the sponsors and/or presentations. I did so by my own decision, and what I have written is my own perception of the SFD3 event.