Yes, I know it sounds a bit weird to have a “what’s new” post on a new product – but in an effort to keep these posts together it seemed to make sense. Besides which, this post is more than just a round-up of features – it’s more a discussion about what vSAN is, what it is capable of, and what it is not capable of…
vSAN is a brand new product from VMware, although it got its first tech preview back at VMworld last year. That’s why I always think that if you’re attending VMworld you should search for and attend the “Tech Preview” sessions. We tend not to crow on about futures stuff outside of a product roadmap and an NDA session – so the Tech Previews are useful for those people outside of that process to get a feel for the blue-sky future.
So what is vSAN? Well, it addresses a long-time challenge of all virtualization projects – how to get the right type of storage to allow for advanced features such as DRS/HA, and deliver the right IOPS – at the right price point. In the early days of VMware (in the now Jurassic period of ESX2.x/2003/4) the only option was FC-SAN. That perhaps wasn’t a big ask for early adopters of virtualization in the Corporate domain, but it rather excluded medium/small business. Thankfully, Virtual Infrastructure 3.x introduced support for both NFS and iSCSI, and those customers were able to source storage that was more competitive. However, even with those enhancements it still left businesses with storage challenges dependent on the application. How do you deliver cost-effective storage to Test/Dev or VDI projects, whilst keeping the price point low? Of course, you could always buy an entry-level array to keep the costs down, but would it offer the performance required? In recent years we’ve seen a host of new appliance-led start-ups (Nutanix, SimpliVity, and Pivot3) offer bundles of hardware with combos of local storage (HDD, SSD and in some cases Fusion-io cards) in an effort to bring the IOPS back to the PCI bus, and allow the use of commodity-based hardware. You could say that VMware vSAN is a software version of this approach. So there’s a definite Y in the road when it comes to this model – do you buy into a physical appliance or do you “roll-your-own” and stick with your existing hardware supplier?
You could say vSAN and its competitors are attempts to deliver “software-defined storage”. I’ve always felt a bit ambivalent about the SDS acronym. Why? Well, because every storage vendor I’ve met since 2003 has said to me “We’re not really hardware vendors, what we’re really selling you is software”. Perhaps I’m naïve and gullible, and have too readily accepted this at face value, I don’t know. I’m no storage guru after all. In fact I’m not a guru in anything really. But I see vSAN as an attempt to get away from the old storage constructs that I started to learn more and more about in 2003, when I was learning VMware for the first time. So with vSAN (and technologies like Tintri) there are no “LUNs” or “Volumes” to manage, mask, present and zone. vSAN presents a single datastore to all the members of the cluster. And the idea of using something like Storage vMotion to move VMs around to free up space or improve their IOPS (by moving a VM to a bigger or faster datastore) is largely irrelevant. That’s not to say Storage vMotion is a dead-in-the-water feature. After all you may still want to move VMs from legacy storage arrays to vSAN, or move a test/dev VM from vSAN to your state-of-the-art storage arrays. As an aside it’s worth saying that Storage vMotion from vSAN-to-Array would be slightly quicker than from Array-to-vSAN. That’s because the architecture of vSAN is so different from conventional shared storage.
vSAN has a number of hardware requirements – you need at least 1 SSD drive, and you cannot use the ESX boot disk as a datastore. I imagine a lot of homelabbers will choose to boot from USB to free up a local HDD. You need not buy an SSD drive to make vSAN run on your home rig. You might have noticed both William Lam and Duncan Epping have shown ways of fooling ESX into thinking an HDD is SSD based. Of course, if you want to enjoy the performance that vSAN delivers you will need the real deal. The SSD portion of vSAN is used purely to address the IOPS demands – it acts as a cache-only storage layer. Data is written to disk first, before it’s cached to improve performance, and this reduces the SSD component’s role as a single point of failure.
I don’t want to use this blogpost to explain how to set up vSAN or configure it, but to highlight some of the design requirements and gotchas associated with it. So with that, let’s start with use cases. What is it good for, and what is it not good for?
vSAN Use Cases
Because vSAN uses commodity-based hardware (local storage) one place where vSAN sings is in the area of virtual desktops. There’s been a lot of progress in storage over the last couple of years to reduce the performance and cost penalty of virtual desktops, both from the regular storage players (EMC, NetApp, Dell and so on) as well as a host of SSD or hybrid storage start-ups (Tintri, Nimble, Pure Storage, WhipTail etc). All the storage vendors have cottoned on that the biggest challenge of virtual desktops (apart from having quality images and having a good application delivery story!) is storage. They’ve successfully changed that penalty into an opportunity to sell more storage. I’ve often felt that a lot of engineering dollars and time have been thrown at this problem, which is largely by design. Even before VDI took off, storage has been a systemic/endemic issue. The hope is that a new architecture will allow genuine economies of scale. I’m not alone in this view. In fact even avowed VDI sceptics are becoming VDI-converts (well, kind of. Folks do like a good news headline that drives traffic to their blogs don’t they? I guess that’s the media for you. Never let the truth get in the way of a good story, eh?)
The second big area is test/dev. The accepted wisdom is that test/dev doesn’t need high-end performance. Good enough will do, and we should save our hardware dollars for production. There is some merit in this, but it’s also the case that developers are no less demanding as consumers, and there are some test/dev environments that experience more disk IOPS churn than production platforms. There’s also a confidence factor – folks who experience poor responsiveness in a test/dev environment are likely to express scepticism about that platform in production. Finally, there’s the ever-present public cloud challenge. Developers turn to the public cloud because enterprise platforms using shared storage require more due diligence when it comes to the provisioning process. Imagine a situation where developers are silo’d in a sandbox using commodity storage, miles away from your enterprise-class storage arrays, which are demarcated for production use. The goal of vSAN is to generate almost a 96% cache hit-rate. That means 96% of the time the reads are coming off solid-state drives with no moving parts.
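To put that hit-rate into perspective, here’s a back-of-an-envelope sketch of the average read latency you might see at different cache hit-rates. The latency figures (0.1ms for SSD, 8ms for a slow spindle) are purely my own illustrative assumptions, not vSAN measurements, and the function name is made up for the example.

```python
def effective_read_latency_ms(hit_rate, ssd_ms, hdd_ms):
    """Average read latency when a fraction `hit_rate` of reads is served from SSD."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Weighted average: cache hits cost ssd_ms, misses fall through to hdd_ms.
    return hit_rate * ssd_ms + (1.0 - hit_rate) * hdd_ms


# Illustrative assumptions only, not measured vSAN figures.
ASSUMED_SSD_MS = 0.1
ASSUMED_HDD_MS = 8.0

for rate in (0.80, 0.90, 0.96):
    latency = effective_read_latency_ms(rate, ASSUMED_SSD_MS, ASSUMED_HDD_MS)
    print(f"{rate:.0%} hit-rate -> {latency:.2f} ms average read")
```

The point is less the absolute numbers and more the shape: every extra percentage point of cache hits removes a slice of slow spindle reads from the average.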
Finally, there’s DR. vSAN is fully compatible with VMware’s vSphere Replication (in truth VR sits so high in the stack it has no clue what the underlying storage platform is – it will replicate VMs from one type of storage (FC) to another (NFS) without care). So your DR location could be commodity-based servers using commodity-based hardware.
So it’s all brilliant and wonderful, and evangelists like me will be able to stare into people’s foreheads for the next couple of years – brainwashing people that VMware is perfect, and vSAN is an out-of-the-box solution with no best practises or gotchas to think of. Right? Erm, not quite. Like any tech vSAN comes with some settings you may or may not need to change. In most cases you won’t want to change these settings – if you do – make sure you’re fully aware of the consequences….
Important vSAN Settings
Firstly, there’s a setting that controls the “Read Cache Reservation”. This is turned on by default, and the vSAN scheduler will take care of what’s called the “Fair Cache Allocation”. By default vSAN makes a reservation on the SSD – 30% for reads, and the rest for writes. The algorithms behind vSAN are written to expect this distribution. Changing this reservation is possible, but it can include files that have nothing to do with the VM’s workload – such as the vmx file, log files and so on. The reservation is set per-VM, and when changed it includes all the files that make up a VM. Ask yourself this question – do you really want to cache log files, and waste valuable SSD space as a consequence? So you should really know the IO profile of a VM before tinkering with this setting. Although the option is there, I suspect many people are best advised to leave it alone.
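As a quick illustration of that default split, here’s a tiny sketch that carves up a single SSD using the 30% read figure mentioned above. The function name and the 200GB drive size are just assumptions for the example – this is arithmetic, not a vSAN API.

```python
def split_ssd_cache(ssd_capacity_gb, read_percent=30):
    """Return (read_cache_gb, write_buffer_gb) for one SSD.

    read_percent defaults to the 30% read reservation described in the post;
    whatever is left over is treated as the write portion.
    """
    if not 0 <= read_percent <= 100:
        raise ValueError("read_percent must be between 0 and 100")
    read_gb = ssd_capacity_gb * read_percent / 100
    write_gb = ssd_capacity_gb - read_gb
    return read_gb, write_gb


# Hypothetical 200GB SSD, chosen only for illustration.
read_gb, write_gb = split_ssd_cache(200)
print(f"read cache: {read_gb:.0f} GB, write portion: {write_gb:.0f} GB")
```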
Secondly, there’s a setting called “Space Reservation”. By default it is set to 0, and as a consequence all the virtual disks provisioned on the vSAN datastore are thinly-provisioned. The important thing to note is that from a vSAN perspective virtual disk formats are largely irrelevant – unless the application requires them (remember guest clustering and VMware Fault Tolerance require the eagerzeroedthick format). There’s absolutely no performance benefit to using thick disks. That’s mainly because of the use of SSD drives, but it’s also a grossly wasteful use of precious SSD capacity. What’s the point of zeroing out blocks on an SSD drive, unless you’re a fan of burning money?
In fairness you might be the sort of shop that isn’t a fan of monitoring disk capacity, and you’re paranoid about massively over-committing your storage. At the back of your mind you picture Wile E. Coyote from the cartoons – running off the end of a cliff. My view is that if you’re not monitoring your storage, the whole thin/thick debate is largely superfluous. The scariest thing is that you’re not monitoring your free space! You must be really tired from all those sleepless nights you’re having, worrying if your VMs are about to fill up a datastore!
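If you want a feel for what “monitoring your storage” boils down to with thin disks, here’s a minimal sketch. Everything in it is hypothetical – the class, the function, the 20% free-space threshold and the disk sizes are my own assumptions for illustration, not anything vSAN or vCenter exposes under these names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualDisk:
    name: str
    provisioned_gb: float  # size the guest thinks it has
    used_gb: float         # blocks actually written to the datastore


def thin_provisioning_report(capacity_gb, disks, min_free_percent=20.0):
    """Summarise over-commitment and free space on one datastore."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    provisioned = sum(d.provisioned_gb for d in disks)
    used = sum(d.used_gb for d in disks)
    free_percent = 100.0 * (capacity_gb - used) / capacity_gb
    return {
        "overcommit_ratio": provisioned / capacity_gb,
        "free_percent": free_percent,
        "alert": free_percent < min_free_percent,
    }


# Hypothetical datastore and disks, for illustration only.
disks = [VirtualDisk("sql01", 800, 300), VirtualDisk("vdi-pool", 700, 550)]
print(thin_provisioning_report(1000, disks))
```

The over-commit ratio tells you how far off the cliff edge you could run; the free-space alert tells you how close you are to it right now.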
Finally, there’s a setting called “Force Provisioning”. At the heart of vSAN are storage policies. These control the number of failures tolerated, and the settings I’ve discussed above. What happens if a provisioning request to create a new VM is made, but it can’t be matched by the storage policy? Should it fail, or should it be allowed to continue regardless? Ostensibly this setting is there for VDI environments where a large number of maintenance tasks (refresh and recompose) or the deployment of a new desktop pool could unintentionally generate a burst of storage IOPS. There are situations where the storage policy settings mean that these tasks would not be allowed to proceed. So see it as a fallback position. It allows you to complete your management tasks, and once the tasks have completed vSAN would respond to the decline in disk IOPS.
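The fail-or-continue question above can be sketched as a simple decision. This is a toy model under my own assumptions – the class and function names are invented, and whether the cluster can satisfy the policy is just passed in as a flag rather than worked out the way vSAN really does it.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    COMPLIANT = "provisioned, policy satisfied"
    FORCED = "provisioned, policy not satisfied"
    REJECTED = "not provisioned"


@dataclass(frozen=True)
class StoragePolicy:
    failures_to_tolerate: int
    force_provisioning: bool = False


def provision(policy, cluster_can_satisfy_policy):
    """Decide what happens to a provisioning request (hypothetical model)."""
    if cluster_can_satisfy_policy:
        return Outcome.COMPLIANT
    # Policy can't be met right now: only carry on if the admin opted in.
    if policy.force_provisioning:
        return Outcome.FORCED
    return Outcome.REJECTED


strict = StoragePolicy(failures_to_tolerate=1)
relaxed = StoragePolicy(failures_to_tolerate=1, force_provisioning=True)
print(provision(strict, cluster_can_satisfy_policy=False).value)
print(provision(relaxed, cluster_can_satisfy_policy=False).value)
```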
Gotchas & Best Practises
Is vSAN ready for Tier 1 Production Applications? As ever with VMware technologies – so long as you stay within the parameters and definition of the design you should be fine. Stray out of those, and start using it as a wrench to drive home a nail, and you could experience unexpected outcomes. First of all, use cases – although big data is listed on the graphics – I don’t think VMware is really expecting customers to run Tier 1 applications in production on vSAN. It’s fine to run Oracle, SAP and Exchange on vSAN within the context of a test/dev environment – and of course, our long-time goal is to run precisely those applications in production. But you must remember that vSAN is a 1.0 release, and Rome wasn’t built in a day. Whenever I’ve seen customers come a cropper with VMware Technologies (or any technology for that matter) it’s when they take something that was designed for Y, and stretch it to do X. Oddly enough when you take an elastic band and use it to lift a bowling ball it has a tendency to snap under the load….
Don’t Make a SAN out of vSAN: The other thing that came out of beta testing was a misunderstanding in the way customers design their vSAN implementation. Despite the name, vSAN isn’t a SAN. It’s a distributed storage system, and designed as such. So what’s a bad idea is this: Buying a number of monster servers, packing them with SSD – and dedicating them to this task – and then presenting the storage to a bunch of diskless ESX hosts. In other words building a conventional SAN out of vSAN. Perhaps vSAN isn’t the best of names, but if I remember rightly the original name was VMware Distributed Storage. I guess vSAN is catchier as a product name than vDS! Now it may be that in the future this is a direction vSAN could (not will) take, but at the moment this is not a good idea. vSAN is designed to be distributed, with a small amount of SSD used as cache, and a large amount of HDD as conventional capacity-based hardware. It’s also been designed for HDDs that excel in storage capacity, rather than spindle speeds – so it’s 7K disks, not 15K disks, for which it’s been optimized. So a vSAN made only from SSD won’t give you the performance improvements you expect – but it will give you an invoice that will make you wince!
VMware HA. Once again a new innovation from VMware has necessitated an overhaul in VMware’s clustering technology. A VM resides on an ESX host, and so do its files. The whole point is keeping the resources of the VM close to each other – memory, CPU, network and now disk are all within the form-factor of a server or blade. But what if a server dies? What then? If an ESX host fails or is put into maintenance mode then that will trigger either a graceful evacuation of the host or a disgraceful one. When the host comes back online it not only re-joins the HA/DRS cluster, it also re-joins the vSAN as a member. Now if it was in maintenance mode then a rebuild begins. In the beta this was delayed for 30mins, but under testing it has been extended to an hour. This is to avoid spurious rebuilds that were not required – by rebuild we mean that the metadata/data backing an individual node that has been down for a period receives delta updates. I guess the analogy would be if you shut down a Microsoft Active Directory Domain Controller for an hour or so: when it came back up it would trigger a directory services synch. The important thing from a virtualization perspective is that we want the ESX host to complete this synch successfully before DRS starts repopulating the host with VMs. Now think about that for a second. After a reboot (for whatever reason) an ESX host now takes 1hr before it joins the cluster. Therefore you may need to factor in additional ESX host resources to cover this period. The operative word is “may”, not “must”. I think much depends on the spare capacity you have left over once a server is unavailable due to an outage or maintenance. The situation is a bit different if the problem is detected as a component failure. If there is a disk failure or read error then vSAN doesn’t wait an hour. The rebuild process begins immediately.
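The two rebuild behaviours described above (wait an hour when a host is merely unavailable, start straight away on a component failure) can be sketched like this. It’s a simplified model under my own assumptions – the enum and function names are invented, and the 60 minute figure is simply the delay quoted in this post.

```python
from enum import Enum

# Delay quoted in this post: 30 minutes in the beta, extended to an hour.
REBUILD_DELAY_MINUTES = 60


class Event(Enum):
    HOST_UNAVAILABLE = "host down or in maintenance mode"
    COMPONENT_FAILURE = "disk failure or read error"


def should_start_rebuild(event, minutes_since_event):
    """Return True if a rebuild would kick off in this simplified model."""
    if minutes_since_event < 0:
        raise ValueError("minutes_since_event cannot be negative")
    if event is Event.COMPONENT_FAILURE:
        # A genuine component failure doesn't wait for the timer.
        return True
    # A host that is just away gets a grace period to avoid spurious rebuilds.
    return minutes_since_event >= REBUILD_DELAY_MINUTES


for event, minutes in [(Event.HOST_UNAVAILABLE, 10),
                       (Event.HOST_UNAVAILABLE, 60),
                       (Event.COMPONENT_FAILURE, 0)]:
    print(f"{event.value} after {minutes} min -> rebuild: "
          f"{should_start_rebuild(event, minutes)}")
```

It’s that grace period you need to keep in mind when working out how much spare host capacity to hold back.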
When is commodity hardware, commodity hardware? This is one for labbers who might want to run vSAN at home. But it could be relevant to a vSAN configured at work too. I’ve been looking into moving back to a home lab. That means buying commodity hardware. Right now I’m very attracted to the HP ML350e series. It supports a truckload of RAM (196GB Max with two CPUs), although it’s big and expensive compared to white boxes and the Shuttle XPC. The reseller offered me a choice of disks. The hot-pluggable ones from HP are proprietary and pricey. The more generic SATA drives are not hot pluggable, and are much cheaper. For my home lab I know what I will be choosing. The other thing I need to think about is what capacity and ratio of HDD and SSD I need for my lab. There could be a tendency to overspend. After all, my plan calls for the use of a Synology NAS which I hope to pack with high-capacity SSD. Although I want to use vSAN, I can imagine that in my very volatile lab environment (where it’s built and destroyed very frequently) having my “core” VMs (domain controller, MS-SQL, View, Management Virtual Desktops) on external storage might give me peace of mind should I do something stupid….