Honesty Comment: Have I used VSAN yet? No. Been too busy. I’m hoping to return to a homelab soon where I will be deploying the 5.5 GA and VSAN. The new lab will have servers with both HDD/SSD so I can play with vFlash, vSAN and also competitive products that offer similar functionality. So this blogpost is all about the theory of VSAN. I thought it would be helpful for me, and for others, to put finger to keyboard. Somehow the writing process does help reinforce the important info.
Last week I wrote about VSAN very quickly, but I did promise I would come back to the technology to delve a little deeper. To be honest this post perhaps should have come first, and that other post afterwards. But what the hell. In case you missed the previous post in the BloggerStorm that is VMworld week, it’s over here:
What is VSAN?
In hindsight I should perhaps have kicked off with a more general overview of VSAN before hitting the settings and gotchas. Remember, VSAN recently became an open public beta – something I am personally REALLY HAPPY about. Personally, I prefer open public betas to private/invite-only ones. It gets more folks in front of a new technology – which helps spread knowledge/awareness/support in the community, but also provides a much larger pool of feedback for product improvements/bugfixes than a closed beta program would. I’m just sayin’.
This graphic on the right gives a thumbnail sketch of VSAN. VMs get a storage policy which contains configuration requirements and allows VSAN to ensure that the correct settings are applied to the VM. VSAN is built in to the vSphere/ESX hosts – there are no VIBs or virtual appliances to be installed. That’s quite important because it should, in principle, give VSAN blistering performance when compared to approaches that require a virtual appliance to be deployed on top of the cluster. It’s for this reason that you can’t really compare the vSphere VSA to VSAN…
Hosts added to a cluster enabled for VSAN are automagically scanned for their storage – and this is claimed by VSAN, creating a single vSANDatastore for all the hosts in the cluster. A combination of SSD/HDD creates a system which reads/writes by default to the SSD, with data being written behind to the underlying spinning disks. I guess you could say that this sort of architecture has already been tried and tested, both technically and commercially, by appliance vendors like Nutanix, Pivot3 and SimpliVity, and by storage array vendors such as Nimble and Tintri. Like Tintri, VSAN presents one single datastore – but unlike Tintri this is distributed across multiple nodes in the vSphere cluster. Despite what everyone says about “VM Sprawl”, just as many organisations suffer “VM Stall” as they find traditional array-based storage too costly, or find it won’t scale cost-effectively to their requirements.
The critical thing is the Storage Policy, which allows you to set up parameters covering capacity, availability and performance. Once this is attached to a VM, if you later decide that the VM (or a collection of VMs) requires more performance or availability, you simply edit the policy. That’s in stark contrast to the Hardware-Defined Storage of the past, where you would consider moving the VM to a different datastore to achieve the same result.
VSAN currently scales to 8 nodes in a cluster. As you would expect, VMware plans to go beyond this figure – it’s inevitable, as everyone will want as much scale as possible. When you build your first VSAN you’ll need to meet at least the minimum pre-requisites. So that’s at least 3 nodes in a cluster, with the option to add additional nodes as you scale out the solution. VSAN uses a very small “witness virtual disk” (one per VM, around 2MB in size), and access to this is used to arbitrate the cluster should a split brain occur. That’s the real reason for the 3-node minimum. If two nodes become disconnected from each other, the third node acts as the “witness node” – and carries the vote. See it as a very democratic approach – albeit in a system where only three people are allowed to vote.
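To make the voting idea concrete, here’s a toy sketch of majority arbitration – this is just an illustration of why the witness breaks ties, not VSAN’s actual implementation:

```python
# Toy model of witness-based arbitration: in a network partition, only
# the side holding a majority of the three votes (two data replicas
# plus the ~2MB witness disk) keeps the VM's storage live.

def surviving_side(partitions):
    """Given a list of partitions (each a set of component names),
    return the partition holding a majority of the 3 votes, or None."""
    total_votes = 3  # replica + replica + witness
    for side in partitions:
        if len(side) > total_votes / 2:
            return side
    return None

# Hosts A and B each hold a replica; host C holds only the witness.
# If A is cut off from B and C, the side with the witness wins the vote.
winner = surviving_side([{"replica-A"}, {"replica-B", "witness-C"}])
print(winner)  # the partition containing replica-B and the witness
```

With only two voters and no witness, neither side ever reaches a majority – which is exactly the split-brain scenario the third node prevents.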
VSAN requires a VMkernel port/IP to handle the replication traffic and internal comms. 1Gbps/10Gbps will both work, but it’s likely that come GA VMware will state that best practice is to use 10Gbps. Remember, best practices are just that – best practices. Shape them to your will. They’re not etched on tablets of stone falling from the sky. VSAN is compatible with NIOC, so it would be viable to use it to guarantee bandwidth from two teamed 10Gbps interfaces. As I said before, adding a new node will trigger VSAN to automatically enrol the storage to be used for the vSANDatastore. The important thing here is that the physical disks hanging off the controller need to be presented as RAW storage, without any RAID levels/volumes defined on the controller card – this is sometimes referred to as “PassThru” or “HBA” mode in various documentation. This essentially makes the ESX host a JBOD, with VSAN taking care of the replicas which protect the VMs in case of a server outage. This might play out well for me in my home lab. The boxes I’m looking at at the moment have RAID controller cards, but they aren’t recognised by vSphere – all it sees are disks hanging off the controller. Precisely the configuration that VSAN wants.
You could call it a Distributed RAID, as the redundancy is provided by the nodes in the cluster replicating to each other, rather than merely striping data across disks inside the server itself – the RAID is provided across the nodes making up the cluster. It is possible to add diskless nodes to the cluster, and they can still access the vSANDatastore. I guess you would do that if you were running out of compute before you ran out of storage capacity. I can see that over time folks are going to find sweet-spots that balance the cost of server/storage.
To get started with VSAN you will need at least one HDD and one SSD in each host. There are ways of fooling ESX into thinking an HDD device is actually an SSD device. This could be useful for testing purposes, or if you’re running ESX “nested” in vInception mode – where one virtualization layer (ESX, Fusion, Workstation) is being used to run another. This is popular in the community – building a big fat uberPC with a truckload of RAM (and ideally SSD at the physical level) to then run a complete vSphere environment. Projects like Alastair Cooke’s “AutoLab” and the baremetalcloud.com offering are illustrations of easy-to-consume ways of doing this. Duncan Epping has a good blogpost about making ESX think an HDD is actually an SSD – Testing vSphere Virtual SAN in your virtual lab with vSphere 5.5. Beware – faking an SSD like this is NOT supported and performance will be degraded… As for the ratio of HDD to SSD, the best practice is 1 SSD for every 10 HDD. In terms of spindle speeds, VSAN is optimised for the cheaper 7.2k disks, rather than the more expensive 10k/15k spindle speeds. While fast spindle speeds are always welcome, they do come at a premium – and I think the assumption is that 99% of the time the R/W activity will take place in cache, with the data de-staged to spinning rust to ensure persistence and capacity.
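As a back-of-the-envelope helper based on the 1:10 guideline above – the function and its wording are my own illustration, not an official VMware sizing tool:

```python
# Rough per-host disk-group sanity check against the ~1 SSD : 10 HDD
# best-practice ratio discussed in the post. Illustrative only.

def check_disk_group(ssd_count, hdd_count):
    """Flag disk groups that fall outside the minimums or the 1:10 ratio."""
    if ssd_count < 1 or hdd_count < 1:
        return "invalid: VSAN needs at least 1 SSD and 1 HDD per host"
    if hdd_count > ssd_count * 10:
        return "warning: more than 10 HDDs per SSD - add cache"
    return "ok"

print(check_disk_group(1, 7))   # ok
print(check_disk_group(1, 12))  # warning
print(check_disk_group(0, 5))   # invalid
```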
There are really two workflows for VSAN – the initial one-time setup, and then repeated use of the feature – selecting a VM Storage Policy appropriate to the VM when creating it. You might come back to the initial setup routine to add a node to the cluster, remove a node from the cluster, or add/update a VM Storage Policy. The steps are these:
Initial Setup Workflow:
- Configure the ESX/vSphere host(s) for VSAN with VMkernel port
- Add/Update the HA/DRS Cluster to support VSAN
- Choose either manual or automatic (default). Manual allows you to select the disks you want VSAN to claim; automatic lets VSAN claim all the RAW/blank storage it can see (not including the boot disk if you have booted from HDD)
- Next, think. What are your requirements for the VMs’ storage? Once you have done thinking, create Storage Policies that match those different requirements. Never fear – there is a default Storage Policy that ships with default settings
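One way to picture that last step: a VM Storage Policy is just a small bag of key/value requirements. The attribute names below are modelled loosely on the capabilities discussed in this post – a sketch, not the real policy engine:

```python
# Toy model of VM Storage Policies as dictionaries of requirements.
# Attribute names are illustrative, echoing the capabilities covered
# later in this post (failures to tolerate, stripes, cache reservation).

DEFAULT_POLICY = {
    "failures_to_tolerate": 1,     # N+1 redundancy out of the box
    "disk_stripes": 1,
    "read_cache_reservation_pct": 0,
}

def make_policy(**overrides):
    """Clone the default policy, then apply per-requirement overrides."""
    policy = dict(DEFAULT_POLICY)
    policy.update(overrides)
    return policy

# A "gold" tier: survive two failures, reserve some read cache.
gold = make_policy(failures_to_tolerate=2, read_cache_reservation_pct=10)
print(gold["failures_to_tolerate"])  # 2
```

The point of the model: editing the policy (not moving the VM) is what changes the VM’s storage behaviour.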
Everyday Usage Workflow:
- Kick off VM deployment
- Select the Storage Policy
- Select the single vSANDatastore for the cluster
- Watch the deployment
- Confirm that the VM Storage Policy requirements are being met on the newly deployed VM
Note: There’s a little gotcha in this release. Stage 2 merely checks that the Storage Policy is correctly configured; it doesn’t validate that the underlying vSANDatastore meets those requirements. So it is possible to have a Storage Policy that says the VM files must be distributed across 4 separate nodes, when there are only 3 nodes in the cluster. In such a scenario the VM would always have an alarm triggered at step 5, because until a 4th node was added the policy requirements could never be physically met. I figure VMware will want to improve on this functionality…
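The missing validation could be sketched like this – a hypothetical check of my own, not something the product does in this release:

```python
# Sketch of the satisfiability check the gotcha above describes:
# a policy is only physically achievable if the cluster has at least
# as many nodes as the policy needs for its components.

def policy_satisfiable(required_nodes, cluster_nodes):
    """True when the cluster can physically place the policy's
    components; otherwise the VM will sit permanently non-compliant."""
    return cluster_nodes >= required_nodes

print(policy_satisfiable(4, 3))  # False - the 4-nodes-on-3 gotcha above
print(policy_satisfiable(4, 4))  # True - compliant once a 4th node joins
```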
Storage Policy Configuration:
Storage Policies work in a similar way to Storage Profiles. I’ve come across Storage Profiles in my recent work with vCloud Director. There, Storage Profiles are used to categorise datastores or datastore clusters, using simple profiles to classify storage as Gold, Silver, Bronze and so on. Storage Profiles can also leverage vendor-supplied VASA plug-ins. These allow vSphere to interrogate the underlying storage array to discover the properties of the datastore (RAID level, replication type and so on). Now, Storage Policies have a VASA provider which is built in to the vSphere host in 5.5 – but it is different from the VASA you normally get from your storage vendor. It specifically supports functionality or attributes that control performance and availability parameters. Unlike Storage Profiles, which applied to datastores, Storage Policies are applied to the VM. The VASA/Storage Policies support being applied to multiple VMs, and VMs can have multiple policies on the same datastore.
In my previous post I talked about the settings/attributes of a Storage Policy. However, I didn’t talk about two parameters that control performance/availability – called “Number of Disk Stripes” and “Number of Failures to Tolerate”. With “Number of Disk Stripes”, by default VSAN takes the files of the VM and makes sure they are available elsewhere. By default this is enabled. It is possible to create a Storage Policy and turn it off. It’s hard to imagine WHY anyone would want to do this, except in a homelab environment where disk space is precious – and you just want to provide cheap storage to your environment. “Number of Failures to Tolerate” allows you to express what level of resiliency you desire. For me this is similar to the “spinner” that we have had for some time in VMware HA. It allows you to indicate whether you have N+1 or N+2 redundancy. On a 3-node cluster, setting this to 1 would mean you would have at least 2 copies, with the 3rd host being the “witness node” with access to the “witness virtual disk”.
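As a rule of thumb, tolerating n failures means n+1 data replicas plus n witness components, hence 2n+1 hosts – a sketch of the arithmetic, not VSAN’s actual placement logic:

```python
# Illustration of what "Number of Failures to Tolerate" (FTT) implies
# for component counts: n+1 full replicas, n witnesses, 2n+1 hosts.

def ftt_layout(ftt):
    """Return the component layout implied by a given FTT value."""
    replicas = ftt + 1   # full copies of the VM's data
    witnesses = ftt      # tie-breaker components for quorum
    return {"replicas": replicas,
            "witnesses": witnesses,
            "min_hosts": replicas + witnesses}

# FTT=1 gives the 3-node example from the paragraph above:
# 2 copies of the data plus 1 witness on the third host.
print(ftt_layout(1))  # {'replicas': 2, 'witnesses': 1, 'min_hosts': 3}
```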
If you wanted a policy to optimise performance for a specific set of VMs, you would increase the “Number of Disk Stripes” and increase the “Read Cache Reservation”. Why? Despite the fact that reads/writes go to SSD by default, at some point the contents of the cache need to be de-staged to spinning disk. The process of de-staging is improved by adding more spindles. Another possibility is that a VM goes to get a block of data from the cache but finds it isn’t there – referred to as a “read-cache miss”. Increasing the size of the “Read Cache Reservation” increases the chances of the blocks being in SSD, rather than having to be pulled from disk and then cached.
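A toy model shows the direction of the effect – my own simplification, assuming reads land uniformly across the working set (real workloads are skewed, which is why a small SSD cache does so well):

```python
# Toy read-cache model: with uniformly random reads, the hit rate is
# simply cache_size / working_set, capped at 100%. The point is the
# direction: a bigger Read Cache Reservation means fewer cache misses.

def hit_rate(cache_gb, working_set_gb):
    """Fraction of reads served from cache under uniform access."""
    return min(1.0, cache_gb / working_set_gb)

for cache in (10, 20, 40):
    print(f"{cache} GB cache over a 100 GB working set: "
          f"{hit_rate(cache, 100):.0%} hits")
```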
Maintenance Mode – Removing a host:
Maintenance mode changes once you have VSAN enabled – it offers three options: “Ensure Compliance”, “Ensure Accessibility” and “Do not evacuate data”. “Ensure Compliance” totally empties the host of running VMs, and of its storage as well. This leaves the host in a state valid for removal from the VMware cluster and VSAN. “Ensure Accessibility” evacuates the VMs, and just enough data that the VMs keep running. I imagine this will be MUCH quicker, and will be used for reboots – for patching, for example. “Do not evacuate data” doesn’t bother evacuating the host of VMs or data – and would be used if you want to remove the host from the cluster because you are deleting the cluster.
What’s NOT Compatible:
For the most part VSAN is very compatible with existing vSphere features. There are some components that are neither compatible nor incompatible – they simply do not apply. So SIOC is N/A, because performance is controlled by the Storage Policy; SDRS is irrelevant, because VSAN presents a single datastore – and there would be nothing to migrate to (although I personally find it interesting that a host could be connected to both VSAN and a SAN – that would cause a move to happen across the fabric, which would probably be undesirable from a performance perspective); and DPM shouldn’t be enabled, as there are circumstances where a VM may be stored on one vSphere host but running on another – what you wouldn’t want is for the vSphere host owning the storage to be powered down.
In this release neither Horizon View, vCloud Director nor 62TB disks are supported. That’s probably going to be addressed at a later date…