I’m back at the event this year, after taking a one-year sabbatical in 2017. I wasn’t working at the time, and didn’t think my bank balance could afford the “VMworld Hit”. Now I’m back in the saddle work-wise, I thought it would be good to catch up with my former colleagues at VMware, and say hello to friends within the community. Shortly before VMworld 2018 kicked off, my fellow vExperts and I were briefed on some of the key VMware announcements. This is pretty typical of many of these programs, and the content was embargoed until closer to VMworld itself. One of these sessions focused on the enhancements to VSAN in the vSphere 6.7 U1 release.
The improvements can be broken down into three categories – Simplified Operations, Efficient Infrastructure and Rapid Support Resolution. There is nothing jaw-dropping or “wow” about these increments, and taken on their own they amount to a tying up of loose ends to some degree. However, loose ends do have a tendency to trip people up, and you’d be surprised how often, from an operational perspective, these issues can collectively bond together to undermine customer experience and satisfaction. So they are not to be underestimated. Also, I would say from my experience that these issues are often much harder to resolve and deliver than many customers really give credit for. Trust me, if there was an easy fix – everyone would leap on it. The fact that doesn’t happen immediately is often because once you pull back the lid of the tin, there’s a mass of complexity or politics to be resolved first.
VUM and VSAN play nice: VMware Update Manager has had a major uplift, and now integrates with VSAN. This means ESXi, drivers and firmware are all updated by VUM, and custom OEM-based ISO images are supported too. These updates are based on the HCL, and ensure the correct drivers and firmware are in place for those all-important pass-through controllers. Support has been added for the Dell HBA330, and the whole update process can be carried out in an offline fashion, without connectivity to the internet.
Improved QuickStart UI: For those new to VSAN, a new “Quick Start” UI has been built into the HTML5 Client. There are new workflows, making it easier to add new or existing hosts in bulk (something that various VMware UIs have struggled with in the past). There are improved ‘pre-checks’ and recommendations which flag up issues before the cluster is instantiated, rather than leaving you with a bunch of errors after the fact. This new UI is seen as a complement to the “Easy Install” wizard, which starts with a single host.
Enter Maintenance Mode (EMM) Enhancements: I’m sure you’ve all had that experience where you click maintenance mode and it seems to get stuck at 2% because of some wider issue. You wind up cancelling it to investigate and re-examine the settings. vSphere 6.7 U1 adds a “Fail Fast” pre-check which simulates the maintenance mode without actually moving data. This VSAN-aware maintenance mode (if you like) has new alerts and alarms that warn you about specific VSAN processes or events that might affect how long the maintenance mode will take, such as hosts already in maintenance mode, ongoing re-syncs, and any repair work in progress.
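To give a feel for what a “fail fast” pre-check does, here is a minimal Python sketch of the idea: look at the cluster state, report anything that would slow the operation down, and move no data. This is purely illustrative and is not how VMware implements it. The ClusterState class, its fields and the precheck_maintenance_mode function are all hypothetical names I’ve made up for the example.

```python
# Illustrative sketch only: models the idea of a "fail fast" pre-check.
# ClusterState and its fields are hypothetical, not a VMware API.
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    hosts_in_maintenance: int = 0      # hosts already in maintenance mode
    resync_bytes_pending: int = 0      # data still waiting to be re-synced
    repairs_in_progress: bool = False  # any repair work currently running


def precheck_maintenance_mode(state: ClusterState) -> List[str]:
    """Return warnings without moving any data; an empty list means no known blockers."""
    warnings = []
    if state.hosts_in_maintenance > 0:
        warnings.append(
            f"{state.hosts_in_maintenance} host(s) already in maintenance mode"
        )
    if state.resync_bytes_pending > 0:
        warnings.append(
            f"re-sync in progress ({state.resync_bytes_pending} bytes pending)"
        )
    if state.repairs_in_progress:
        warnings.append("repair work in progress")
    return warnings


if __name__ == "__main__":
    state = ClusterState(hosts_in_maintenance=1, resync_bytes_pending=5_000_000)
    for warning in precheck_maintenance_mode(state):
        print("WARNING:", warning)
```

The point of the pattern is that the check is cheap and read-only, so you find out about the problem up front instead of watching a progress bar sit at 2%.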
Capacity History/Capacity Estimator: There are new historical capacity views which show total, used and free capacity over time, as well as a history of de-duplication and compression ratios (assuming you’re brave enough to turn these features on – remember they are not enabled by default). The Usable Capacity Estimator digests these stats, and re-displays them based on the selected storage policy – giving admins a handy “what if” guideline on the consequences of changing the policy settings.
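The “what if” arithmetic can be roughed out in a few lines of Python. This is a back-of-the-envelope sketch, not the estimator’s actual logic: it assumes the commonly quoted raw-space multipliers for each protection scheme (2x and 3x for mirroring, 1.33x for RAID-5, 1.5x for RAID-6) and ignores slack space, de-duplication and compression. The policy labels and function names are my own.

```python
# Rough "what if" arithmetic only; NOT how the VSAN Capacity Estimator is implemented.
# Multipliers are the commonly quoted raw-space overheads per policy (assumption).
POLICY_OVERHEAD = {
    "RAID-1 FTT=1": 2.0,    # two full copies of the data
    "RAID-1 FTT=2": 3.0,    # three full copies of the data
    "RAID-5 FTT=1": 4 / 3,  # 3 data + 1 parity
    "RAID-6 FTT=2": 1.5,    # 4 data + 2 parity
}


def usable_capacity_tb(free_raw_tb: float, policy: str) -> float:
    """Estimate how much VM data fits into the free raw capacity under a policy."""
    return free_raw_tb / POLICY_OVERHEAD[policy]


def what_if(free_raw_tb: float) -> dict:
    """Return the estimate for every known policy, rounded to two decimals."""
    return {
        policy: round(usable_capacity_tb(free_raw_tb, policy), 2)
        for policy in POLICY_OVERHEAD
    }


if __name__ == "__main__":
    for policy, usable in what_if(12.0).items():
        print(f"{policy}: ~{usable} TB usable from 12 TB free raw")
```

So with 12 TB of free raw capacity, a mirrored FTT=1 policy leaves roughly 6 TB for VM data, while RAID-5 stretches that to about 9 TB. Treat the numbers as a guideline in the same spirit as the UI.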
vRealize Operations VSAN Stretched Cluster Awareness: VSAN has supported stretched cluster configurations (one cluster spanning two sites) for some time; however, other operations tools such as the pithily named “vRealize Operations” have not displayed that functionality in their interface. With version 7.0 of VROPS this is now a recognised configuration.
PowerCLI your VSAN: Starting with PowerCLI 10.2, commands that were previously only available in the Ruby-based RVC command-line are now easier to access. This could spell the end of having to SSH to the vCenter Appliance, and go through a rather obtuse logon process to carry out tasks. Some 18 RVC-style tasks have been ported into PowerCLI, including gathering cluster info, resync stats, health checks, VSAN object info and status, as well as VSAN disk stats.
Guest and VSAN Space Reclaim: As we all know, a delete doesn’t delete, and as such free space isn’t really handed back to the various systems without some sort of secondary trim/unmap process. In the past these have been somewhat difficult to get at, and seen as a bit dangerous in production hours because of the impact on IOPS. VMware now supports space reclaim within the guest operating system for Windows Server 2012, Windows 8 or newer, as well as Linux using ext4, xfs or btrfs. The process can be done online and scheduled – no doubt for the same reasons we have scheduled it in the past.
Mixed MTU Size for Stretched Clusters: Does what it says on the tin. VSAN supports defining the MTU when selecting/configuring VMkernel traffic types. It means jumbo frames can be configured between hosts in the different sites, while at the same time specifying a different MTU value for the “witness traffic”, as the witness host is in a different location, potentially without the capacity to have a consistent MTU value across all the locations in the configuration.
Nested Fault Domains: This feature is aimed specifically at non-stretched clusters, and its intention is to offer protection against rack/floor failure, as well as the failure of individual nodes. It’s applied through storage policies using the “Fault Domain Failures to Tolerate” setting (the snappy abbreviation is FDFTT!). Essentially, this ensures that the replica(s) and witness are not all permitted to reside on the same rack/enclosure.
Rapid Support Resolution:
Improved Health Checks/Recommendations: Health Checks and Recommendations have been an important part of VSAN for some time, and these have been enhanced. The Storage Controller firmware health check will now allow for multiple approved firmware levels (rather than just one). There’s also a unicast network performance check. New controls have been added to the HTML5 Client, which allow you to silence false positives in the health check (previously this was only accessible within the Ruby Console).
In Product Support Diagnostics: This assists GSS in resolving customer cases more efficiently. It’s a specialised dashboard which replaces and deprecates the older “VSAN Observer”, and reduces the need to gather support bundles and upload them to VMware. It’s enabled upon request by GSS, as the on-demand network diagnostics have a 1-second sampling rate.