Upgrading from SRM 4.1 to SRM 5.0

From vmWIKI

Originating Author

Michelle Laverick

Michelle Laverick.jpg

Video Content [TBA]

Upgrading from SRM 4.1 to SRM 5.0

Version: vCenter SRM 5.0

Throughout my career I’ve been asked this one question over and over again: “What works best—an upgrade or a clean install?” Without fail, the folks asking this question nearly always know the answer. What they are really asking is whether I think the upgrade process is robust enough to justify its use. I’ve always been happy to upgrade within releases—say, from 3.0, 3.1, 3.5, and so on. My personal rule has always been to carry out a clean installation when the version’s major number changes—from Windows 2003 to Windows 2008, for example. With that in mind, I think it’s perfectly feasible to have a combination of upgrades and clean installs. It makes sense that VMware customers would want to upgrade their vCenter environment: even with a relatively small environment it takes some time to add all the hosts, create clusters, potentially configure resource pools, and create folders. When it comes to other components like ESX hosts, I just don’t see why people would want to upgrade a hypervisor; in an ideal world, there should be no data on a host that isn’t reproducible. I speak as someone who has invested a lot of time in scripted installations for ESX and PowerCLI scripts for vCenter. In the course of a year my labs are endlessly rebuilt, and it makes no sense to do that work constantly by hand.

So when I am asked that question, I often respond, “Why do you ask me a question you already know the answer to?” The fact is that clean installations will always enjoy a higher success rate, since there is no existing configuration data to be ported to a new environment. Given the complexity and flexibility available in most technologies, it seems inevitable that more complications arise from upgrades than from clean installs, but to say so is both a truism and rather trite. Personally, I’m one of those loony guys who upgrades as soon as a new release comes out. For example, I upgraded from Mac OS X Leopard to Lion on the very day Apple shipped the new version of its OS. I do this largely from an academic interest in knowing about the pitfalls and perils long before any of my customers embark on the same path some months (and in some cases years!) later. I regard this as a foolhardy approach for a production environment, where upgrades must be carried out cautiously and judiciously.

This chapter is not intended to be a thoroughly comprehensive guide to upgrading from vSphere 4 to vSphere 5; for that I recommend you consult other literature that covers the process in more detail. Nonetheless, it would be difficult to explain the SRM upgrade process properly without also covering the vSphere 4 to vSphere 5 upgrade path. As you know, SRM 5.0 requires vSphere 5, and if you were previously a customer of SRM 4.1 you will need to carry out both a platform upgrade and an upgrade to SRM 5.0 to be on the latest release.

A successful upgrade begins with a backup of your entire vSphere environment, including components such as SRM, View, and vCloud Director. Your upgrade plan should also include a rollback routine should the upgrade process falter, and I recommend taking snapshots and backups along the way so that you can roll back in stages during the upgrade. I make regular use of VMware snapshots in my upgrade processes, because everything I do is based on virtual machines. In some cases I will take the extra precaution of taking hot clones of critical VMs, just in case there is an issue with a snapshot once it has been reverted.
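Since everything in my lab runs as a VM, those staged rollback points are easy to script. Here is a minimal PowerCLI sketch, assuming you are already connected to the Protected Site vCenter with Connect-VIServer; the VM names are placeholders for whatever your vCenter, SRM, and SQL servers are called in your environment.

# Take a pre-upgrade snapshot of the management VMs (names below are examples only)
$mgmtVMs = "vc4nyc", "srm4nyc", "sql4nyc"
Get-VM -Name $mgmtVMs | New-Snapshot -Name "Pre-SRM5-Upgrade" -Description "Rollback point taken before the vSphere 5/SRM 5.0 upgrade" -Quiesce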

Like all upgrades, the process follows a particular order. Many steps are mandatory, some are optional, and they must be undertaken in the right order so that the prerequisites for each stage are in place. You might also want to consider the possibility that a clean installation may actually be the fastest route to your goal. So, if you need SRM for a particular project and lead-in times are narrow, you might find a fresh install meets your needs more directly. It’s also worth stating that individual steps in the upgrade process may not necessarily require or support an in-place upgrade. For instance, many of my customers prefer to evacuate an ESX host of all its VMs using maintenance mode, and then execute a clean scripted installation of the ESX server software. Most of these customers have invested a lot of time and effort in automating their ESX host builds. This pays off in a number of places: consistency, rebuilding hosts if they fail, and upgrading ESX from one release to another. As for this chapter, you will find that I will take the in-place approach wherever possible—although in the real world an upgrade process might actually use a combination of methods.

As stated previously, a successful upgrade begins with a backup and with carrying out the steps in the correct order. If you want to back up your existing SRM 4.1 installation you can do this by backing up the SQL database that backs the installation, together with the vmware-dr.xml file that defines some of the core configuration settings. You should also consider a backup of any scripts from the SRM server, together with any ancillary files that make the current solution work—for example, you may have a .csv file that you use for the re-IP of your VMs. You should also thoroughly test that your existing Recovery Plans and Protection Groups work correctly, because only the Protection Groups and Recovery Plans that are in a valid state are saved during the upgrade; if they are not in a valid state they are discarded.
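If you prefer to script that backup, something along the following lines works. Treat it as a sketch only: the SRM install path, scripts folder, backup share, SQL server, and database name shown here are all assumptions you will need to adjust, and it assumes the SQL Server command-line tools (sqlcmd) are available.

# Copy the SRM 4.1 configuration file, scripts and re-IP .csv files to a safe location (paths are examples)
$backupDir = "\\nas01\backups\srm41-$(Get-Date -Format yyyyMMdd)"
New-Item -ItemType Directory -Path $backupDir -Force | Out-Null
Copy-Item "C:\Program Files (x86)\VMware\VMware vCenter Site Recovery Manager\config\vmware-dr.xml" -Destination $backupDir
Copy-Item "C:\SRM-Scripts\*" -Destination $backupDir -Recurse
# Back up the SRM database using Windows Authentication (server and database names are assumptions)
sqlcmd -S sql4nyc -E -Q "BACKUP DATABASE [SRM_DB] TO DISK = N'D:\Backups\SRM_DB_pre-upgrade.bak' WITH INIT"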

Below is a checklist that outlines the stages that are required. I’ve marked some as optional, which means they can be skipped and returned to later. I would have to say that my attitude regarding what is optional and what is required is a very personal one. For instance, some people regard an upgrade of VMware Tools as required, while others would see it as optional, as VMs will continue to function normally even when they are out of date. The way I look at upgrades is that, in the main, customers want to complete the upgrade in the shortest possible time. Any non-mandatory steps can be deferred until a later date and perhaps rolled out over a longer period. This stealthier approach to upgrades limits the steps to just the bare minimum required to become functional.

1. Run the vCenter Host Agent Pre-Upgrade Checker (optional).

2. Upgrade vCenter.

3. Upgrade the vSphere client.

4. Upgrade the vCenter Update Manager.

5. Upgrade the Update Manager plug-in.

6. Upgrade third-party plug-ins (optional).

7. Upgrade ESX hosts.

8. Upgrade SRM.

9. Upgrade VMware Tools (optional).

10. Upgrade the virtual hardware (optional).

11. Upgrade the VMFS volumes (optional).

12. Upgrade Distributed vSwitches (optional).

Upgrading vSphere

The first stage in upgrading SRM is to upgrade vSphere from version 4 to version 5. In all of my upgrades throughout my career I’ve always ensured that I was just one version revision away from the release I was upgrading to. It follows that the closer the revisions, the greater your chance of success. There are two reasons for this. First, the difference between the releases is kept to a minimum, and in my experience vendors plough most of their testing efforts into upgrades from the release immediately prior to the new one; the farther back your platform is from the new release, the greater the chance that problems will occur. Second, if you keep on top of patches and maintenance of vSphere 4 you won’t face a double upgrade—in other words, having to upgrade vSphere 4.0 to vSphere 4.1, for example, before upgrading to vSphere 5. As soon as your build lags massively behind the current release, this can be a compelling reason to cut to the chase and opt for a clean installation.

I would begin the upgrade process at the Protected Site, and complete those tasks there initially. After you have upgraded vCenter from version 4 to version 5, you will find the SRM 4.1 service will no longer start or function, as SRM 4.1 is not forward-compatible with vCenter 5. If the Protected Site experiences an unexpected disaster, you would still have a valid vCenter 4.1/SRM 4.1 configuration at the Recovery Site from which you could still run a Recovery Plan. The Protected and Recovery Sites would not be able to communicate with each other because of this version mismatch, and the pairing between the sites would be lost, with the Protected Site marked as “Not Responding” in the SRM interface (see Figure 16.1).

Upgrading-srm- (01).jpg

Figure 16.1 The New Jersey site, based on SRM 4.1, cannot connect with the New York site, which was upgraded to vSphere 5 and SRM 5.

Step 1: Run the vCenter Host Agent Pre-Upgrade Checker

Before embarking on a vCenter upgrade I heartily recommend running the upgrade checker that ships on the .iso file for vCenter. The utility actually updates itself from VMware.com and as such keeps track of new upgrade issues as they are discovered. Run the utility before you start the vCenter/ESX portion of the upgrade. When the utility runs you will be asked to select the DSN used to connect to the vCenter database, together with the database credentials. Once these have been supplied, the checker interrogates the vSphere environment, including the ESX hosts (see Figure 16.2).

Upgrading-srm- (02).jpg

Figure 16.2 A successful outcome from the upgrade checker utility

Step 2: Upgrade vCenter

Before upgrading vCenter you should uninstall the Guided Consolidation feature of vCenter if you have installed it. Guided Consolidation has been deprecated in the vSphere 5 release, and the upgrade will not proceed if it is installed in the current vCenter. There’s a very good chance that this will not be the case; for some reason the Guided Consolidation feature didn’t catch on among my customers or peer group. I think this was largely because many had already P2V’d their environments some time ago. This has been an issue for some time and did affect some vSphere 4.1 upgrades. In fact, I had problems uninstalling the Guided Consolidation feature from the test vCenter 4 environment that I created for this chapter, and had to refer to VMware KB 1025657 (http://kb.vmware.com/kb/1025657) to remove it correctly from the system. During the build of my vCenter environment I had changed the Windows drive letter used for the DVD/CD from D: to E:, and the easiest way to resolve the uninstall problem was to swap the drive letters back around on the vCenter, after which the uninstall of Guided Consolidation worked without a problem.

The other issue you will need to consider is what user account to use for the vCenter upgrade. vCenter 4 introduced full support for Windows Authentication to Microsoft SQL—something I’ve wanted for some time—so you may wish to remind yourself of how vCenter was installed initially so that you use the same service account for the upgrade. In my case, I created an account called vcdbuser-nyc, and gave that account local administrator rights on the vCenter server. This same account was used to create the Microsoft SQL DSN configuration using Windows Authentication. So, before mounting the vSphere 5 .iso file, I logged in to the same account I had used for my initial installation.

Once you have worked your way through the usual layers of the media UI, you should see a message indicating that the installer has found that a previous release of the vCenter Server software is installed (see Figure 16.3).

Work your way through the usual suspects of the EULA and so forth, and you will have an opportunity to input your vCenter 5 license key. It is possible to carry on with the upgrade without this license key; without a license key your vCenter 5 system will be in evaluation mode for 60 days. After accepting the default settings for the database options, you will receive a warning indicating that the installed version of Update Manager is not compatible with vCenter 5, and will itself need upgrading at some later stage (see Figure 16.4).

Upgrading-srm- (03).jpg

Figure 16.3 The vCenter 5.0 installer has detected that vCenter 4.1 is already installed.

Once this warning is confirmed, the system will ask you for confirmation that you wish to upgrade the vCenter database from one suitable for version 4 to one suitable for version 5 (see Figure 16.5). If you have ever carried out an intermediary upgrade of vCenter from one sub release to another you should recognize this dialog box, as it has been present in vCenter upgrades for some time. Once this is confirmed, you will be asked if you would like the system to automatically upgrade the vCenter Agent that is installed in every ESX host managed by vCenter (see Figure 16.6).

Next, you will be asked for the password of the service account used to start vCenter (see Figure 16.7). If you use Windows Authentication with Microsoft SQL, this will be the same account used to create the DSN and start the upgrade. If you’re using Microsoft SQL Authentication you may need to provide separate database and service account credentials to complete this process. All in all, Windows Authentication is much simpler and more secure, and I switched to it as soon as VMware fully supported it.

Upgrading-srm- (04).jpg

Figure 16.4 The vCenter 5 installer detects that other vCenter services are incompatible with it, and will also need upgrading.

Upgrading-srm- (05).jpg

Figure 16.5 vCenter will not only upgrade the vCenter software, but the database back end as well.

Upgrading-srm- (06).jpg

Figure 16.6 The vCenter upgrade process also upgrades the VPX agent on ESX 4.1 hosts, allowing them to be managed by the new vCenter.

Upgrading-srm- (07).jpg

Figure 16.7 My upgrade was triggered using the vcdbuser-nyc account that will use Windows Authentication to the SQL server, and will also be the service account for the vCenter Server Services.

At this stage you may receive an error stating “The fully qualified domain name cannot be resolved.” At the time of this writing, there is a small bug in the vSphere 5 release candidate that is generating this error message, and hopefully it will be resolved by the time of GA. I found that vCenter could indeed resolve the FQDN via both ping and nslookup tests, so I took this to be a benign error that would not affect the upgrade.

Next, you will be asked to set the path for both the vCenter Server software and the new Inventory Service. Once these paths are set, you can accept the defaults for the TCP port numbers for the vCenter and Inventory Service, respectively. I’ve never had to change these defaults, but you may need to review them in light of your datacenter security practices. At the end of the upgrade wizard you can set the amount of memory to allocate to the Java Virtual Machine (JVM). This control was introduced in vSphere 4.1 and allows the administrator to scale the amount of RAM allocated to the JVM. Select the radio button that reflects the scale of your environment; this parameter can be changed after the installation, if necessary (see Figure 16.8).

Finally, you will be asked if you want to “bump” (increase) the ephemeral ports value. This somewhat cryptic dialog box is all about the default settings for ephemeral ports used on Distributed vSwitches (DvSwitches). Ephemeral ports are a type of port group setting that allows a VM to grab a port on a DvSwitch, and then hand it back to the “pool” of ports when it is powered down. These ports are created on demand, and when used in the context of vCloud Director it can take time for them to be created and assigned if you have a significant number of vApps created simultaneously. “Bumping” the configuration allows vCloud Director to use ephemeral ports quickly, efficiently, and with scale. My general recommendation is that if you have no intention to deploy vCloud Director in your environment, just ignore the dialog box.

Upgrading-srm- (08).jpg

Figure 16.8 Since vCenter 4.1, the administrator can allocate a portion of memory to the JVM relative to the scale of the vCenter deployment.

Step 3: Upgrade the vCenter Client

After the upgrade of vCenter you will need to update the vSphere client. This can be triggered in two main ways: from the install media .iso file, or by merely logging on to the vCenter with an existing client and acknowledging the update message. I’ve had issues with the latter process in previous releases, mainly because I once canceled the update process midway through and my vSphere client was broken from that point onward. It is also possible to download the vSphere client from the vCenter Welcome Web page, but remember that the client is no longer held locally on the ESX host. If you use an ESX host to download the client you will be downloading it via the Internet and VMware.com, and the client isn’t a small .exe file; it’s 300MB in size. I think the upgrade of the vSphere client is trivial and isn’t worth documenting beyond the points I’ve made above. One final statement I would make is that if you are working in a hybrid environment of mixed versions of vSphere 4 and vSphere 5 you will end up with multiple versions of the vSphere client installed. If you think this is likely, it is a good idea to download the ThinApp version of the vSphere client from the VMware Flings website to reduce the number of client installs and updates you have to do; alternatively, if you are licensed for VMware ThinApp or some other third-party application virtualization software, you could package the vSphere 5 client yourself in the same way.

Step 4: Upgrade the VMware Update Manager (VUM)

It seems like a curious phrase—to update the Update Manager—but nonetheless you can upgrade the very tool that will allow you to upgrade the rest of your infrastructure, including ESX hosts, VMware Tools, and VMware hardware levels. There are a couple of changes around VUM that you should be aware of. First, VUM no longer supports the patching of VMs. In earlier versions, VUM would communicate with the popular Shavlik.com website as an independent source of patch updates for Windows and Linux guests. This feature has been deprecated and is no longer used, probably because patch management is often carried out with third-party tools that don’t care whether Windows is physical or virtual. Historically, VUM only ever patched VMs, which is understandable given its remit. VUM allows for upgrades from ESX/ESXi 4.x to ESXi 5.0 only; it cannot be used to upgrade ESX/ESXi 3 hosts to ESX/ESXi 4 or ESXi 5.0. I think this is a classic case where, if you were using a platform that was two generations behind (in this case, ESX 3), a clean install would be more appropriate than an upgrade. Finally, you should know that, unlike the vCenter 5 upgrade where no reboot is required at the end of the process, VUM requires a reboot. If, like me, you run your VUM instance on the same system as your vCenter, you will need to plan for a maintenance window. If you run VUM on a dedicated Windows instance, you should be able to tolerate this temporary outage. With that said, the reboot is not forced by the installer; I was able to defer it and it did not cause a problem.

When you run the installer for VMware VUM, just as with vCenter, the system will recognize that the previous installation is present (see Figure 16.9).

Upgrading-srm- (09).jpg

Figure 16.9 The install of Update Manager 5.0 detects the previous release of the product.

You will also be given some controls and a warning about the upgrade process with respect to what’s supported and unsupported (see Figure 16.10).

After this warning, you will need to provide the name, port number, and username/password credentials to allow VUM to register with vCenter. In most cases you should find all the dialog box fields are already completed, and all you will need to supply is the password to authenticate with vCenter. VUM is still a 32-bit application that, although it can run on a 64-bit platform, still requires a 32-bit DSN in the Microsoft ODBC configuration. Additionally, it does not officially support Windows Authentication to the Microsoft SQL server; as such you will need to provide the database password values during the upgrade (see Figure 16.11). In this case, you will find that the username field is already filled in from the previous installation of VUM on vSphere 4, and all you will need to provide is your password for the database user. As with the vCenter upgrade, you will be asked to confirm that you want to upgrade the database that backs VUM (see Figure 16.12).
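One related gotcha worth knowing: on 64-bit Windows, the ODBC Administrator you reach from Administrative Tools manages 64-bit DSNs only, so if you ever need to check or re-create the 32-bit DSN that VUM uses you have to launch the 32-bit tool explicitly, for example:

# Launch the 32-bit ODBC Data Source Administrator on 64-bit Windows (this is where VUM's DSN lives)
C:\Windows\SysWOW64\odbcad32.exe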

Step 5: Upgrade the VUM Plug-in

Once the new version of VUM has been installed you should be able to install the new plug-in for it using the Plug-in Manager window available from the Plug-ins menu. You might notice that the vCenter SRM 4.1 plug-in is listed there as well (see Figure 16.13). You can ignore this plug-in, as we will not need it in our upgrade from SRM 4.1 to SRM 5.0.

Upgrading-srm- (10).jpg

Figure 16.10 Update Manager explains the new support rules. Critically, VMware has deprecated patching of virtual machines.

Upgrading-srm- (11).jpg

Figure 16.11 Update Manager only supports SQL Authentication and requires a 32-bit DSN, but it will only install to a 64-bit edition of Windows.

Upgrading-srm- (12).jpg

Figure 16.12 VUM’s back-end database is upgraded. Back up the VUM database and take a server snapshot if you run VUM and SQL in a VM.

Upgrading-srm- (13).jpg

Figure 16.13 As vCenter and VUM are upgraded, new client plug-ins are required, and this is also true of SRM.

Step 6: Upgrade Third-Party Plug-ins (Optional)

As a regular user of the vSphere client I make extensive use of third-party plug-ins from many vendors. It’s highly likely that the old vCenter 4 plug-ins will not be certified for use with vCenter 5, and therefore they are most likely due for an upgrade. My top plug-ins come from the main storage vendors that I use on a daily basis, including Dell, EMC, and NetApp. With that said, I don’t consider these plug-ins to be a vital or essential part of my upgrade process. In short, they are nice-to-haves rather than must-haves. That’s why I consider this step to be “optional,” as it could be deferred to the end of the process. Nonetheless, if you are not in a hurry, now would be a good time to review these plug-ins, given that we are already upgrading the VMware extensions to vCenter.

Step 7: Upgrade the ESX Hosts

From a technical standpoint, SRM 5.0 has no dependencies from an ESX version perspective. It will work with ESX 4.1 and ESX 5.0 added into the same vCenter. Of course, you would need to be careful not to mix ESX 4 and ESX 5 in the same cluster because they are incompatible with each other. But if your goal is to upgrade SRM and get to a functional state such that you can carry out failovers and failbacks at will you could actually bypass this step. However, historically VMware has always placed this stage after the upgrade of the management system, and it’s for this reason that I placed this step in this location. One anomaly I have seen is when an ESX 4.1 host is added to vCenter 4 before its iSCSI configuration has been set up. I found I could not even see the iSCSI software adapter from vCenter 5. Instead, I had to open the vSphere client directly on the host, and configure it there. This was an experience I had in the early beta and release candidates, which I assume will be resolved by the time of the GA.

It’s worth saying that VMware’s VUM is not your only option for an upgrade process. It is possible to have an interactive upgrade from ESX 4 to ESX 5 using the ESX 5 DVD installer. There is also the option of using the esxcli utility to upgrade using an offline patch bundle, which might appeal to you if you need to perform so-called “headless” upgrades. Finally, there’s an option to use the new Auto Deploy functionality that loads ESXi across the network into memory; of course, this isn’t really an upgrade process but a redeploy process. I will focus on the VUM method as I believe this is the most nonintrusive method from the perspective of maximizing the uptime of your VMs, and reducing the steps required to get to a new version of ESX while retaining the existing configuration as much as possible. Before you begin with an upgrade you might like to carry out another scan of your ESX 4.1 host to make sure it is patched to the hilt, and fully compliant from a VUM baseline perspective. As you can see, I have two ESX 4.1 hosts being managed by vCenter 5 that are fully compliant after having the default Critical Host Patches and Non-Critical Host Patches assigned to them (see Figure 16.14).
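For reference, the esxcli offline-bundle route mentioned above looks roughly like the commands below. This is only a sketch: the depot path and image profile name are placeholders, the host needs to be in maintenance mode first, and you should confirm against the upgrade guide that this path is supported from your particular starting build.

# List the image profiles contained in the offline bundle (depot path is an example)
esxcli software sources profile list -d /vmfs/volumes/datastore1/ESXi-5.0.0-offline-bundle.zip
# Apply the chosen profile, then reboot the host (profile name is an example)
esxcli software profile update -d /vmfs/volumes/datastore1/ESXi-5.0.0-offline-bundle.zip -p ESXi-5.0.0-standard
reboot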

The first stage in the upgrade process is to import the ESX 5.0 image into the Update Manager repository. You can do this by using the Import ESXi Image link in the ESXi Images tab (see Figure 16.15).

In the wizard you can browse for the .iso image of the ESX 5.0 build. After you select it, it will be imported into the VUM storage location (see Figure 16.16), and you will be prompted to create a baseline. After completing this import process you should have an ESX 5.0 build in the ESXi Images tab, as well as an upgrade baseline in the Baselines and Groups tab. A baseline is merely a collection of agreed-upon patches that can be used as a barometer of whether a host is compliant when scanned, or as the list of patches to be applied during the remediation process (see Figure 16.17).

Upgrading-srm- (14).jpg

Figure 16.14 A successful scan of two ESXi 4 hosts using vCenter 5 and VUM 5, indicating they are fully patched prior to the upgrade

Upgrading-srm- (15).jpg

Figure 16.15 Using the Import ESXi Image link to bring the ESXi 5 image into the VUM scope

Upgrading-srm- (16).jpg

Figure 16.16 The progress bar showing the import of the ESXi image held within the DVD .iso file

The next stage is to attach the baseline to the ESX host or a cluster (see Figure 16.18). You do this by switching to the Hosts & Clusters view, selecting the Update Manager tab, and using the Attach link to assign the desired baseline. The VUM baseline that contains the requirement for a host to be running on ESXi 5 can be attached to a cluster. The administrator can then ask VUM to scan the host to check its compliance, and then issue a “remediate” command.
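If you have the Update Manager PowerCLI plug-in installed, the attach-and-scan steps can also be scripted. The following is a minimal sketch, assuming a cluster called "NYC Cluster" and a baseline named to match the one created during the image import:

# Attach the ESXi 5 upgrade baseline to a cluster, scan it and review compliance (names are examples)
$cluster  = Get-Cluster -Name "NYC Cluster"
$baseline = Get-Baseline -Name "ESXi 5.0 Upgrade"
Attach-Baseline -Baseline $baseline -Entity $cluster
Scan-Inventory -Entity $cluster
Get-Compliance -Entity $cluster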

Upgrading-srm- (17).jpg

Figure 16.17 Once the import is successful a VUM baseline can be created.

Upgrading-srm- (18).jpg

Figure 16.18 Attaching the baseline to the ESX host or a cluster

If a scan was carried out at this stage the status of the ESX hosts would change from green to yellow, and the hosts would be marked as “incompatible” with the baseline from an upgrade perspective (see Figure 16.19). This is because we have yet to actually push out the upgrade to the hosts.

The remediation process comes in two parts: a stage process and a remediate process (see Figure 16.20). The stage process pushes the update down to the host but does not trigger the remediation, whereas the remediation process both pushes the update down to the host and triggers the remediation process at the same time. I prefer to stage up the hosts, especially when I have a lot of them. I also like the confidence that successful staging of the hosts gives me, as I feel the “all or nothing” approach of just clicking the Remediate button is somewhat risky. Additionally, it means I can keep the ESX hosts up for the maximum time, using the current uptime to drive the update down to the ESX hosts, before requesting that the update be applied.

Upgrading-srm- (19).jpg

Figure 16.19 After a scan using an ESXi 5 baseline, the ESXi 4 hosts are shown to be incompatible, and in need of remediation.

Upgrading-srm- (20).jpg

Figure 16.20 Prestaging an upgrade or patch process allows the host to be operational for the longest period of time.

Before you carry out a remediation, you should confirm a few things: first, whether vMotion is working successfully, and second, whether maintenance mode executes without errors. VUM relies heavily on these two features. If they fail for any reason, you will find that the remediation process will stall, and you will be required to resolve that problem manually. Third, I recommend setting your DRS cluster to be fully automated as this will allow the remediation process to continue without waiting for an administrator to be involved. When I run the remediation process I generally start with just one ESX host within the cluster, and I confirm that the upgrade process works for one host. In a small cluster I will tend to do the remediation on a host-by-host basis, especially if the cluster is heavily loaded and cannot tolerate a huge amount of maintenance mode activity. In large clusters, once I have one ESX host upgraded, I will tend to run the remediation on the entire cluster. Whether you do the remediation on a host-by-host or a cluster-by-cluster basis depends a lot on how much confidence you have in Update Manager, and the time you have. When you click the Remediate button, you must select the right type of baseline to apply (in our case, Upgrade Baseline) and the baseline you created at the end of the ESX 5.0 image import process (see Figure 16.21).
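The DRS automation change mentioned above is a one-liner in PowerCLI if you prefer not to click through the cluster settings; the cluster name is an example, and you may want to restore your normal automation level once remediation is complete.

# Temporarily set DRS to fully automated so VUM can evacuate hosts without waiting on an administrator
Get-Cluster -Name "NYC Cluster" | Set-Cluster -DrsAutomationLevel FullyAutomated -Confirm:$false
# After remediation, put it back to whatever you normally run, for example:
# Get-Cluster -Name "NYC Cluster" | Set-Cluster -DrsAutomationLevel PartiallyAutomated -Confirm:$false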

A significant number of settings appear in the wizard (see Figure 16.22), and I feel most of these fall squarely into the zone of “common sense,” but I would like to draw your attention to a couple of options that could greatly increase the success of the upgrade process. First, I recommend enabling the option labeled “Disable any removable media devices connected to the virtual machines on the host” in the Maintenance Mode page. Connected floppy and DVD/CD-ROM devices are known to cause problems with vMotion, and as a consequence, with maintenance mode as well. Additionally, I would enable as many of the options within the Cluster Remediation Options section as possible.

Upgrading-srm- (21).jpg

Figure 16.21 The Remediation Selection Wizard where the upgrade baseline created earlier can be applied

Upgrading-srm- (22).jpg

Figure 16.22 VUM supports many options. The important factor is how many simultaneous remediations can be carried out.

Let me take each of these in turn and explain how I think they could improve the experience of using VUM. I will assign each option a number, with the first option (Disable Distributed Power Management…) as #1 and the last option (Migrate powered off and suspended virtual machines…) as #5, so that I do not have to reproduce the long descriptions of each option.


1. DPM puts ESX hosts in a powered-off or “standby” state when it deems you have more hosts powered on than are really necessary. The more hosts you have online during the remediation the more “slots” exist for moving VMs off the ESX hosts. The net effect should be to reduce the strain the cluster is put under with hosts going in and coming out of maintenance mode.

2. It is possible for ESX to refuse to enter maintenance mode due to a general lack of resources in the cluster. HA has strict admission control rules that prevent VMs from being powered on if it thinks there are insufficient resources in terms of capacity.

You will see this with small clusters that are heavily loaded which, for various reasons, have calculated a large slot size. Turning this option on relaxes admission control and makes it more likely that the VMs will be evacuated from the host in a graceful manner through a combination of vMotion and maintenance mode, all orchestrated by Update Manager.

3. Temporarily disabling FT during the remediation process does introduce a slight risk: What if the host where the primary FT VM was running failed? There would be no FT protection for that VM. However, by turning off FT you increase the number of possible targets for vMotion, since the primary and secondary pairs in FT can never be on the same host.

4. I think this setting is common sense. The more simultaneous remediations you can carry out, the faster the whole process is completed. Of course, on my two-node cluster the setting is rather meaningless, because I simply don’t have enough physical hosts in my lab to reach my preferred minimum of at least three ESX hosts in any VMware cluster.

5. This option will ensure that VMs are located on other ESX hosts during the remediation. The same effect can be achieved by enabling the Fully Automated setting on a DRS cluster.

Once the remediation has completed you will need to relicense the host; until you do this it will be in evaluation mode.
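If you would rather script the relicensing, PowerCLI can assign the key; the host name and the license key below are obviously placeholders for your own.

# Assign a vSphere 5 license key to the freshly upgraded host (host name and key are examples)
Get-VMHost -Name esx1.corp.com | Set-VMHost -LicenseKey "AAAAA-BBBBB-CCCCC-DDDDD-EEEEE" -Confirm:$false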

Upgrading Site Recovery Manager

After all that, we are finally in a position to carry out an upgrade of SRM. We could actually continue with the vSphere 5 upgrade process if we wished, but as this book is essentially about SRM I want to get into the SRM upgrade as soon as possible. If you have gotten this far in the upgrade process, you might like to consider taking a snapshot of your environment. I’ve been doing that all along in the process. The actual upgrade of SRM 5.0 is more akin to a clean install than an in-place upgrade. The only real difference is that you have to establish a database or DSN configuration prior to running the setup. During the upgrade the installer will preserve your configuration held in the vmware-dr.xml file, which can be imported back into the system once the repair and array configuration steps have been carried out. Before embarking on the upgrade I recommend manually backing up any scripts and .csv files you have to another location as an additional precaution.

Step 8: Upgrade SRM

During the installation you will be asked to reinput the vCenter user account and password used to register the SRM extensions. At this point the SRM 5.0 installer will recognize the previous registration made by SRM 4.1; to continue you will need to click the Yes button, to overwrite the existing registration (see Figure 16.23).

Alongside this authentication process you will need to resupply the original SRM database user account and password. As with vCenter and VUM, successful authentication will result in SRM 5.0 discovering an existing SRM database, and you will get the opportunity to upgrade it to be compatible with the new release (see Figure 16.24). Although the installer does not display the database backup warning we saw with vCenter and VUM, I again recommend a backup and snapshot of the SRM server and SQL systems where possible.

At the end of the upgrade you will be given a prompt indicating the next stages: updating the storage configuration and repairing the sites. The storage configuration option essentially opens a Web page that is a report of the array configuration prior to the upgrade (see Figure 16.25). In this case, I used NetApp as the storage vendor.

Upgrading-srm- (23).jpg

Figure 16.23 The SRM 5.0 installer recognizes a previous extension based on SRM 4.1, and lets you overwrite it with the newer version.

Upgrading-srm- (24).jpg

Figure 16.24 As with vCenter and VUM, the SRM installer can upgrade the database.

Upgrading-srm- (25).jpg

Figure 16.25 Post-upgrade information reports on the current SRA configuration

Clearly, using an SRA from the previous release that lacks the advanced functionality that SRM 5.0 offers isn’t going to get us very far. So the next stage is to uninstall all the SRAs that were deployed with SRM 4.1 and install the new versions that ship with SRM 5.0. I had variable success with uninstalling SRAs; some would uninstall and some would not. Of course, I did not have the opportunity to install every SRA and check the uninstall routine. But I generally recommend the removal of old software before installing the new software to prevent any situation where SRM accidentally discovers an old and out-of-date SRA.

Once the SRA software has been upgraded we are in a position to install the new SRM plug-in, and complete the upgrade. Occasionally, I’ve found that the Plug-ins Manager fails to offer the SRM 5.0 plug-in for download. It is possible to bypass the download and install link in the Plug-ins Manager, and download and install the plug-in directly from the SRM 5.0 server using this URL:

http://srm4nyc:8096/VMware-srmplugin.exe

A word of warning here: At first sight, it will look as though the upgrade has completely clobbered the original configuration. Even after repairing the site and configuring the array managers, the Recovery Plans and Protection Groups do not auto-magically appear in SRM. Instead, they have to be restored from the backup taken by SRM 5.0 during the upgrade. Installing the SRM 5.0 plug-in, re-pairing the site, and creating an array manager configuration are all topics I covered in Chapter 7, Installing VMware SRM, and I don’t feel it would be beneficial to repeat them here. Different vendors have different approaches to managing the SRA component, but most require an uninstall of their old SRA prior to installation of the new SRA. I heartily recommend that once the SRA has been configured you confirm that it returns all the LUNs/volumes you expect to be replicated. Occasionally, I’ve had to “refresh” the SRA to ensure that all volumes are marked as being replicated.

Once the pairing and array configuration is in place you can open a command prompt on the SRM server. It doesn’t matter which SRM server you connect to in order to run the srm-migration command. To run the following command you will need to be in the C:\Program Files (x86)\VMware\VMware vCenter Site Recovery Manager\bin path:

srm-migration.exe -cmd importConfig -cfg ..\config\vmware-dr.xml -lcl-usr corp\administrator -rem-usr corp\administrator

The srm-migration command also supports -lcl-csv-file and -rem-csv-file switches that allow you to import the dr-ip-exporter files from the SRM 4.1 release. As you can see, the srm-migration tool requires you to supply the username of the SRM administrator account of the SRM server you connect to in order to run the command (-lcl-usr), as well as the account of the remote SRM server (-rem-usr). The utility then produces quite lengthy output as it re-creates the Protection Groups and Recovery Plans. In my tests, running this command stopped the SRM service on the SRM server on which it was run, and I needed to restart the service in order to connect.
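If you used dr-ip-exporter with SRM 4.1, those switches let the same import carry your re-IP settings across. The following is only illustrative; the .csv file names and the corp\administrator accounts are placeholders for your own files and SRM administrator credentials.

srm-migration.exe -cmd importConfig -cfg ..\config\vmware-dr.xml -lcl-usr corp\administrator -rem-usr corp\administrator -lcl-csv-file ..\config\nyc-dr-ip.csv -rem-csv-file ..\config\nj-dr-ip.csv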

Step 9: Upgrade VMware Tools (Optional)

WARNING: The upgrade of VMware Tools and virtual hardware has been known to create a brand-new local area connection within Windows. This results in the “loss” of your static IP addresses as the new network interface defaults to using DHCP. Ensure that you back up your VM and make a note of its current IP configuration before embarking on this step of the upgrade process. I found with Windows 2008 R2 64-bit an update of VMware Tools and virtual hardware did not remove my NIC settings. A VMware Tools upgrade initiated via Update Manager does trigger a reboot.

After an upgrade of ESX from version 4.1 to 5.0 you will find that your VM’s internal VMware Tools package will be out of date. To determine which VMs need their tools updated, from vCenter you can add an additional column to the Virtual Machines tab, called VMware Tools Status (see Figure 16.26). You can then sort it to display all the VMs whose VMware Tools package is out of date.

Alternatively, if you prefer you can use VMware PowerCLI to see a definitive report of the VMs whose VMware Tools are in a troublesome state, such as “Not running” or “Out of date.” Although it’s a bit torturous to look at, this PowerCLI one-liner will produce the report:

Get-VM | Get-View | Select-Object @{N="Name";E={$_.Name}},@{Name="ToolsStatus";E={$_.Guest.ToolsStatus}}
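A slightly friendlier variant filters the report down to just the problem VMs; the toolsOld and toolsNotRunning values used here are the standard ToolsStatus enumerations.

# Report only the VMs whose VMware Tools are out of date or not running
Get-VM | Get-View |
  Where-Object { $_.Guest.ToolsStatus -eq "toolsOld" -or $_.Guest.ToolsStatus -eq "toolsNotRunning" } |
  Select-Object Name, @{Name="ToolsStatus";Expression={$_.Guest.ToolsStatus}}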

Upgrading-srm- (26).jpg

Figure 16.26 This tab shows that some VMs are in need of a VMware Tools upgrade.

This report also returns information about your placeholder VMs, so you may wish to filter out those with your PowerCLI commands.

You might find some VMs have already had their VMware Tools automatically updated. This is because toward the end of vSphere 4, VMware introduced new settings that enable the VM to check its tools status on boot-up, and if it discovers its VMware Tools are out of date, to automatically upgrade them (see Figure 16.27).

This setting is not a default for all VMs, but fortunately VUM can assist in enabling it for large numbers of VMs, without the need to resort to PowerCLI. In the VMs and Templates view, on the Update Manager tab of the vCenter, datacenter, or folder of VMs, you should see a “VMware Tools upgrade settings” button next to the Remediate button (see Figure 16.28).

Upgrading-srm- (27).jpg

Figure 16.27 It is possible to have the status of VMware Tools checked on next power on and upgraded if found to be out of date.

Upgrading-srm- (28).jpg

Figure 16.28 The “VMware Tools upgrade settings” button allows you to control how VMware Tools is upgraded.

Clicking this button will bring up a dialog box that will allow you to select which VMs will have this option enabled, including templates.
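For completeness, the same per-VM setting can also be flipped with PowerCLI if you ever need to script it. This is a sketch only; it reconfigures every VM in an example folder called "Web" to check and upgrade its tools at the next power cycle.

# Set the VMware Tools upgrade policy to "upgradeAtPowerCycle" for the VMs in an example folder
Get-Folder -Name "Web" | Get-VM | Get-View | ForEach-Object {
    $spec = New-Object VMware.Vim.VirtualMachineConfigSpec
    $spec.Tools = New-Object VMware.Vim.ToolsConfigInfo
    $spec.Tools.ToolsUpgradePolicy = "upgradeAtPowerCycle"
    $_.ReconfigVM($spec)
}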

Ideally, any upgrade should avoid the need to shut down VMs, so it’s likely that the vast majority of your VMs will not be up to date.

It’s possible to upgrade VMware Tools via many methods.

•Individually, by right-clicking a VM, selecting Guest, and choosing Install/Upgrade VMware Tools. This option allows for an interactive or automatic tools upgrade.

•In bulk, using the preceding method if you multiselect affected VMs in the Virtual Machines tab.

•Via PowerCLI, through the Update-Tools cmdlet, which will apply the VMware Tools update package to nominated VMs, like so:

Get-ResourcePool Web | Get-VM | Update-Tools -NoReboot

•Via Update Manager, using the same method we used to upgrade the ESX hosts with VMware Update.

Different people in the VMware community have different approaches to the VMware Tools update. Some use the “Check and Upgrade Tools during power cycling” option, and just assume that over a period of time VMware Tools will eventually be updated, as part of a normal patch management schedule that frequently will require a reboot anyway as part of the process. Others prefer the PowerCLI method because it allows them to use very tailored filters to apply the update to the right VMs at the right time; critically, it also allows the administrator to suppress the default reboot of VMware Tools to Windows guests. Yet other administrators prefer the consistency of using one management system—in our case, VMware Update Manager—to manage the entire process. Whatever you prefer, there are some important caveats to remember. For instance, if you are using the PowerCLI method, the -NoReboot option when using Update-Tools in PowerCLI only applies to Windows guest operating systems. There is no 100% guarantee that the reboot will not happen even if you use the -NoReboot option, though; a lot depends on the version of the tools installed, and the version of ESX/vCenter you are using. Additionally, in a non-Windows-based operating system you may find that all Update-Tools does is mount the VMware Tools .iso file to the guest operating system, leaving it to you to extract the .tgz file or install the .rpm file in the case of some Linux distributions. If this is the case you might prefer to use your own in-house methods of pushing out software updates, instead of using VMware’s. I prefer using VMware Update Manager as it has a sophisticated way of applying the update, while at the same time offering the VMware administrator a method to validate whether the update was successful.

Step 10: Upgrade Virtual Hardware (Optional)

Alongside a VMware Tools update you may also want to carry out a virtual machine hardware upgrade. You will be pleased to learn that upgrading the VM hardware doesn’t require any anti-static discharge precautions or run the risk of losing fiddly screws down the crevices of badly finished cases that cut you to ribbons! VMware has its own unique version system for VM hardware levels that increment across the various virtualization platforms it produces (Workstation, Fusion, Player, and ESX). ESXi 5 raises the VMware hardware level from version 7 (also supported on ESX 4) to VM version 8. You can see these hardware levels in the vSphere client when you create a brand-new VM using the custom options (see Figure 16.29).

You will find the VM version using the same changes to the GUI I showed in the previous section. The Virtual Machines tab also supports the addition of a VM Version column (see Figure 16.30).

Upgrading-srm- (29).jpg

Figure 16.29 Upgrading the VM version requires a power off of the VM. Remember, a VM with hardware level 8 cannot run on an ESX4 host.

Upgrading-srm- (30).jpg

Figure 16.30 The VM Version column indicates the hardware level of the VM.

Again, there is a PowerCLI “one-liner” that will list all your VMs with their VM version, like so:

Get-VM | Get-View | Sort-Object Name | ForEach-Object { Write-Host $_.Name $_.Config.Version }

Unlike VMware Tools upgrades, where a reboot can sometimes be suppressed or deferred, virtual hardware upgrades require a reboot. Additionally, these upgrades are nonreversible, so once a VM is upgraded from VM version 7 to VM version 8 it will no longer run on an ESX 4 host. Therefore, the decision to upgrade the virtual machine hardware level is not to be taken lightly if you plan to carry out your upgrade in a gradual and stealthy manner, running a hybrid model in which vCenter 5 manages both ESX 4 and ESX 5 clusters. I like the idea of doing the VMware Tools and hardware level upgrade in one fell swoop. It gets the whole unpleasant business out of the way, and again I prefer to do this via VUM. However, as with VMware Tools, there are a number of different methods to choose from.

• Individually, by right-clicking a VM when it is powered off and selecting the Upgrade Virtual Hardware option.

• In bulk, by multiselecting affected VMs in the Virtual Machines tab. Remember, the selected VMs must be powered off first.

• Via PowerCLI, using the following command:

Get-VM web* | Get-View | % { $_.UpgradeVM($null) }

I recommend looking at these scripts from Arne Fokkema’s ict-freak.nl site. Although these were developed from the perspective of an upgrade from ESX 3 to ESX 4, there is no reason they could not be used for an upgrade from ESX 4 to ESX 5:

http://ict-freak.nl/2009/06/27/powercli-upgrading-vhardware-to-vsphere-part-1-templates/

http://ict-freak.nl/2009/07/15/powercli-upgrading-vhardware-to-vsphere-part-2-vms/

• Via Update Manager, using the same method we used to upgrade the ESX hosts with VMware Update.

As I stated earlier, I prefer to use Update Manager, as I think it’s nice to be consistent with the other methods used in this chapter. By default, VUM ships with two built-in baselines: one for upgrading VMware Tools and one for upgrading the VM version. These can be seen under the Baselines and Groups tab by clicking the VMs/VAs button (see Figure 16.31).

Upgrading-srm- (31).jpg

Figure 16.31 While there are many ways to upgrade both VMware Tools and virtual hardware, I prefer to use VUM for consistency.

These baselines can be attached to any object in the VMs & Templates view. However, there’s one big downside. Although multiple baselines can be attached to the VMs, only one can be applied to VMs at any one time (see Figure 16.32).

Step 11: Upgrade VMFS Volumes (Optional)

As you probably know, VMFS has been updated from version 3 to version 5. The last time there was an upgrade to the VMware file system was when people transitioned from ESX 2 to ESX 3. When ESX 4 was released VMware chose not to change the file system, and this offered an easy upgrade path from one release to another. Changing a file system is a big undertaking for any vendor, and it’s not unusual for this to be done only when absolutely necessary. Fortunately, our lives are made easier by the fact that VMware has made it possible to upgrade VMFS even while VMs on the volume are powered on. Previous upgrades required Storage vMotion or a power off of all VMs to unlock the file system ready for an upgrade.

Upgrading-srm- (32).jpg

Figure 16.32 Sadly, it appears that one cannot simultaneously upgrade both the VMware Tools and virtual hardware in one fell swoop.

In an ideal world you would create a brand-new LUN/volume and format that cleanly with VMFS-5, the reason being that although VMFS-3 volumes can be upgraded to VMFS-5, doing this preserves some of the original limitations surrounding VMFS-3. I think Cormac Hogan of VMware has one of the best roundups of VMFS-5 and the upgrade process online:

http://blogs.vmware.com/vsphere/2011/07/new-vsphere-50-storage-features-part-1-vmfs-5.html

Here is a list of some of the old properties that are still present when an in-place upgrade has been made of VMFS.

• VMFS-5 upgraded from VMFS-3 continues to use the previous file block size which may be larger than the unified 1MB file block size.

• VMFS-5 upgraded from VMFS-3 continues to use 64KB sub-blocks rather than the new 8KB sub-blocks.

• VMFS-5 upgraded from VMFS-3 continues to have a file limit of 30,720 rather than a new file limit of > 100,000 for the newly created VMFS-5.

• VMFS-5 upgraded from VMFS-3 continues to use the Master Boot Record (MBR) partition type; when the VMFS-5 volume grows larger than 2TB, it automatically and seamlessly switches from MBR to GUID Partition Table (GPT) with no impact to the running VMs.

• VMFS-5 upgraded from VMFS-3 continues to have its partition starting on sector 128; newly created VMFS-5 partitions will have their partition starting at sector 2048.

Whether these “lost” features are significant to you is moot. For me the big change is the move away from MBR to GPT. It opens the door to allow VMware to have virtual disks and RDMs that break through the 2TB size barrier. It’s worth remembering that currently in ESX 5, despite the improvements in the VMFS file system and VMkernel, the maximum size of any .vmdk file and RDM is still only 2TB. This is in spite of the fact that many modern guest operating systems can address single partitions that are much larger than this, with them moving to GPT some time ago.

So, what is the alternative if you feel a desperate urge to have your VMFS partitions start at sector 2048 rather than 128? Well, you could create a new LUN/volume, format it natively with VMFS-5, and then use Storage vMotion to move your VMFS from the old file system to the new one. Depending on the size of the VMs and the performance of your storage array, that could take some time. This new LUN/volume would have to be configured for replication if it was to be used by Site Recovery Manager. It’s perhaps worth remembering that SRM doesn’t really handle Storage vMotion very seamlessly.

If I was going to embark on this approach I would make my two datastores part of the same Protection Group. In this configuration, Storage vMotion works like a charm. Nonetheless, the extra administrator time to complete the process is something you might want to balance against the perceived “benefits” of cleanly installed VMFS volumes. Of course, customers who use NFS don’t have to worry about this issue in the slightest—a fact that has not escaped the attention of many of my customers. To carry out an in-place upgrade of the VMFS file system, locate the VMFS volume in the inventory and click the link to Upgrade to VMFS-5 (see Figure 16.33).
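Before clicking through datastore by datastore it can help to see which volumes are still on VMFS-3. A quick PowerCLI check (a sketch, assuming an existing connection to vCenter) looks like this:

# List VMFS datastores with their file system version to spot any remaining VMFS-3 volumes
Get-Datastore | Where-Object { $_.Type -eq "VMFS" } |
  Select-Object Name, FileSystemVersion, CapacityMB |
  Sort-Object FileSystemVersion, Name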

Step 12: Upgrade Distributed vSwitches (Optional)

vSphere 5 introduces enhancements to the VMware Distributed vSwitch, and it is possible to upgrade to the new version 5 switch (see Figure 16.34). The upgrade process is seamless to vSphere and does not affect virtual machines. To carry out the upgrade, switch to the Networking view, select the DvSwitch, and click the Upgrade link.

Upgrading-srm- (33).jpg

Figure 16.33 The Upgrade to VMFS-5 link on the properties of a VMFS volume

Upgrading-srm- (34).jpg

Figure 16.34 vSphere 5 introduces a new distributed virtual switch format.

Summary

This concludes the upgrade process, but of course, given your use of VMware technologies, the process might not end here. For example, you may need to update your VMware View deployment as well. As you can see, the upgrade is highly automated and relatively smooth, but care should be taken to minimize downtime wherever possible. As virtualization platforms increasingly mature, it is possible to upgrade vSphere and the associated layers seamlessly without much interruption to the guest operating systems that provide the applications and services upon which your users depend. I think VMware has done a good job of planning and staging the vSphere 5 aligned releases so that there is not a large lead-in time between the platform release (vSphere 5) and the associated add-ons like SRM that complement it.

Well, this is the end of the book, and I would like to use these last few paragraphs to make some final conclusions and observations about VMware Site Recovery Manager and VMware generally. I first started working with VMware products in late 2003, and in fact it wasn’t until 2004 that I seriously became involved with VMware ESX and VirtualCenter. So I see that we are all on a huge learning curve, because even our so-called experts, gurus, and evangelists are relatively new to virtualization. But as ever in our industry, there are some extremely sharp people out in the field who reacted brilliantly to the seismic shift that I saw coming the first time I watched a vMotion demonstration.

There’s been a lot of talk about how hypervisors will or are becoming a commodity. I still think we are a little bit away from that, as VMware licensing shows—there is still a premium to be charged on the virtualization layer. The marketplace is forever changing and VMware’s competitors will try to catch up, but I think that will take much longer than many pundits think. These pundits don’t realize that VMware isn’t going to stay still while others advance. Companies thrive when they have a market to either create or defend. As the virtualization layer becomes increasingly commoditized, for me that means management is now where the money has moved to, and SRM is firmly in that camp.

But I see another shift that is equally seismic, and that is a revolution in our management tools because, quite simply, the old management tools don’t cut the mustard: they aren’t VM-aware. VMware is creating these VM-aware products (Site Recovery Manager, vCloud Director, View, and others) now, not in some far-flung future. So, if you are a VMware shop, don’t wait around—get on and play with these technologies now, as I have done, because they are “the next big thing” that you have been looking for in your career. As the years roll by, expect to see the R in SRM disappear—and with the advent of cloud computing, VMware Site Manager will be as much a cloud management tool as it is a disaster recovery tool. I imagine Site Manager will also integrate seamlessly with the new developments in long-distance vMotion, allowing you to orchestrate the planned move of VMs from one site to another without the VMs being powered off. In the future, I can imagine the technology that currently sits behind SRM being used to move virtual machines from an internal private cloud to an external public cloud—and from one cloud provider to another.