Part 17: My vCloud Journey Journal – Is my Storage/Network right?
A couple of weeks ago I wrote a post about how I divvied up my storage and networking ready for vCloud Director. At the time I was smug with satisfaction that I’d done it the right way… and then doubts started to creep in. As you might recall I have two clusters (Gold/Silver). Gold has more servers/memory than Silver (despite having older, less functional CPUs than Silver), and is connected by 2Gb fibre-channel to SAS-based storage (replicated every 5 minutes). So on the compute level at least – I was happy with the configuration.
On the storage side I’d crudely carved it up into 4 tiers – and presented all the storage to all the clusters where possible (the Silver cluster doesn’t have FC cards, so there’s no possibility of presenting all the storage to every cluster – or resource group, as they are sometimes referred to in vCD literature).
I’d taken much the same approach with the Distributed vSwitches – I’d created two – one for “infrastructure” management (vSphere hosts, vMotion, FT, IP-Storage, Management etc), and a separate DvSwitch for the Organizational Virtual Datacenters (the networking needed by my tenants). Every DvSwitch was available to every cluster.
On paper there’s nothing “wrong” as such with this configuration. But I was worried that it wasn’t “realistic” enough. The truth is in the world of virtualization we have often silo’d virtual resources alongside the physical. In fact the very “silo” making I’ve often argued against is actually quite common amongst virtualization folks. I guess old habits die hard. This is what I mean.
Often a VMware Cluster is configured by folks as not only a discrete block of compute/RAM power, but as discrete storage and networking too. Put simply, ClusterA might contain 16 hosts with access to VLAN10-20 and LUN10-20, whereas ClusterB would contain 20 hosts with access to VLAN30-40 and LUN30-40. That often causes problems when folks want to move a VM from ClusterA to ClusterB. The common workaround is to present some kind of “swing” LUN/volume (I think NFS is a good candidate here, as it’s so easy to present & unpresent as needed) as a method of moving VMs from one cluster to another OR, you could say, one silo to another.
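To make the ClusterA/ClusterB example concrete, here is a rough sketch in Python. It is only a toy model of the layout described above, not anything that talks to vSphere: the Cluster class, the needs_swing_lun helper and the swing LUN number 99 are all names I have made up for illustration.

```python
from dataclasses import dataclass

# Toy model of a silo'd cluster: hypothetical names, no vSphere API involved.
@dataclass(frozen=True)
class Cluster:
    name: str
    hosts: int
    vlans: frozenset
    luns: frozenset

def shared_luns(a: Cluster, b: Cluster) -> frozenset:
    """LUNs that both clusters can see."""
    return a.luns & b.luns

def needs_swing_lun(a: Cluster, b: Cluster) -> bool:
    """True when the clusters share no storage, so a move needs a swing LUN."""
    return not shared_luns(a, b)

def present_lun(cluster: Cluster, lun: int) -> Cluster:
    """Return a copy of the cluster with one extra LUN presented to it."""
    return Cluster(cluster.name, cluster.hosts, cluster.vlans, cluster.luns | {lun})

# VLAN10-20 / LUN10-20 and VLAN30-40 / LUN30-40, inclusive, as in the text.
cluster_a = Cluster("ClusterA", 16, frozenset(range(10, 21)), frozenset(range(10, 21)))
cluster_b = Cluster("ClusterB", 20, frozenset(range(30, 41)), frozenset(range(30, 41)))

if __name__ == "__main__":
    print(needs_swing_lun(cluster_a, cluster_b))
    # Present an assumed swing LUN (99) to both silos and check again.
    swing_a = present_lun(cluster_a, 99)
    swing_b = present_lun(cluster_b, 99)
    print(needs_swing_lun(swing_a, swing_b))
```

The point of the sketch is simply that with fully separate LUN ranges the intersection is empty, and the swing LUN is the one shared item that makes the move possible.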
In short I was worried my configuration wasn’t “silo’d” enough to match the popular ways that vSphere is configured. As ever, you are always striving for realism in a “lab” environment, and often because resources are scarce you fall short. But in this case it was as much to do with the way I had constructed the vSphere layer.
There was another concern as well. With my Provider Virtual Datacenters (the type of vDC that points to a vSphere cluster and its resources: hosts, networks and storage) I’d given both Gold & Silver access to all the storage I physically could. That meant that Gold had access to Tier1-4 and Silver had access to Tier2-4. It occurred to me that this might not really be very desirable. I could make the offering much less flexible – so Gold still had access to Tier1-4 but Silver only had access to Tier3-4. The theory was I could stop someone using the Silver Provider Datacenter from accessing the more expensive storage in the environment. The other theory was that Gold consumers could still have the CPU/Memory horsepower they needed, but if the application they had wasn’t Disk IOPS intensive they could opt to use the Tier4_Bronze category of storage (iSCSI/SATA drives with no replication).
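The difference between what each Provider vDC can physically reach and what I might choose to offer can be sketched as two small tables. This is just an illustration in Python; the dictionaries and the offered_tiers function are my own hypothetical names, and the tiers are referred to by number only.

```python
# What each Provider vDC can physically reach (Silver has no FC cards, so no Tier1).
PHYSICAL_ACCESS = {
    "Gold": {1, 2, 3, 4},
    "Silver": {2, 3, 4},
}

# The more restrictive offering discussed above: Silver limited to Tier3-4.
RESTRICTED_POLICY = {
    "Gold": {1, 2, 3, 4},
    "Silver": {3, 4},
}

def offered_tiers(pvdc: str, policy: dict) -> list:
    """Tiers a Provider vDC offers: allowed by policy AND physically reachable."""
    return sorted(PHYSICAL_ACCESS[pvdc] & policy[pvdc])

if __name__ == "__main__":
    for name in ("Gold", "Silver"):
        print(name, "open:", offered_tiers(name, PHYSICAL_ACCESS),
              "restricted:", offered_tiers(name, RESTRICTED_POLICY))
```

Intersecting the policy with the physical access means a policy can never promise storage the cluster cannot actually see.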
Of course similar arguments could be made around the networking. Shouldn’t Gold have its own DvSwitch, separate from the Silver DvSwitch…?
All of this has got me thinking about the process of making silos. I’ve made very long and detailed arguments against the siloing of resources and the siloing of expertise – arguing that it leads to bottlenecks and a lack of flexibility in the environment. But I’m increasingly trying to think in another way. It comes from my experience of arguing about politics [stick with me here, okay]. For many years I’ve wasted my breath trying to promote my own political views. It’s only recently that I’ve started to question them seriously – and I’m doing this by trying to understand why opponents think the way they do. I’ve found I’ve made more progress in my thinking in the last two years by questioning my opinions than I ever did in the 20 years I spent defending them. I think the same applies to this siloing question. I’ve chosen to put myself in the anti-silo camp – because I want to see more flexibility and dynamism in what we do – rather than it being mired in physical barriers and political barriers – the infighting between teams within CorpIT that is so often the brake on change.
But what if I wasn’t anti-silo? And instead genuinely asked the question of why siloing occurs, and asked – what is it about silos that people like? After all, there must be something good about the silo for people to adopt and use it so extensively in their infrastructures? Here are some reasons. We like silos because they are very controlled. We like control. We like it not because we are deranged and anally retentive (although many people who work in IT can be like that – you know who I mean!). We love silos because if a problem occurs, the silo limits how far the performance or uptime problems can spread and damage the operational integrity of the systems we support. We love silos because we can make changes within the context of the silo – without that affecting or cascading to other systems. We love silos because mentally the brain can conceptualize a discrete set of relationships between relatively separately managed systems. However…
For all the hot air about the “single pane of glass”, the reality is that with virtualization touching so many different resource layers you inevitably end up interfacing with many management tools – unless you’re fortunate enough to run on one of the more modern “stacks” which are sold as servers/storage/network packages in a box. I will give you an example. Yesterday I added “esx05nyc” to my Gold cluster. I added a new dual-port card to the server as well as a new FC card. To make that system work I had to use 3 systems…
1. The vSphere Web-client (to add the host to the cluster)
2. The FC-Switch Management Tool (the FC card introduced a new WWN, so needed to be added into my zone configuration)
3. NetApp System Manager (the card’s new WWN needed to be added to the initiator group)
To be honest it’s not like I do this every day – and I was amazed that after all this work – the LUN appeared. I was pretty proud of myself I can tell you [Yes, I know this is incredibly sad of me, but what can I say…]
It strikes me we want all the separation that siloing brings without the heartache and hassle that comes along with it. You could say that the concept of the Virtual Datacenter (or the software-defined datacenter) is an attempt to have our cake and eat it. To carry on having the separation that we need to manage resources and enforce security & compliance – but rather than having those hard-coded in hard-ware, they are soft-coded in soft-ware – and therefore more receptive to change and reconfiguration as the business demands change.
As for me – I’m going to stick with my non-silo’d approach for storage/networking. Already I’m finding it’s making my life easier. There’s no need for me to have “swing” LUNs. If I need to move (within vSphere) a VM from one cluster to another, I just power it down (Gold=AMD, and Silver=INTEL) and drag-and-drop. The process completes in seconds and is almost as easy as doing a vMotion within a cluster – unless of course, that VM resides on Tier1-Platinum storage, which is physically not accessible to the Silver cluster. In that case a Storage vMotion is triggered. You see sometimes the physical world just won’t leave you alone….
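For what it’s worth, the decision I go through in my head when moving a VM between Gold and Silver looks roughly like the Python sketch below. It is only a model of the reasoning in this post: plan_move and the two lookup tables are hypothetical names of my own, and nothing here calls a real vSphere API.

```python
# CPU vendor per cluster and the storage tiers each cluster can physically reach,
# as described in this post. Purely illustrative data.
CPU_VENDOR = {"Gold": "AMD", "Silver": "Intel"}
TIER_ACCESS = {"Gold": {1, 2, 3, 4}, "Silver": {2, 3, 4}}

def plan_move(vm_tier: int, source: str, target: str) -> list:
    """Return the steps needed to move a VM on the given tier between clusters."""
    steps = []
    # Different CPU vendors rule out a live move, so the VM is powered off first.
    if CPU_VENDOR[source] != CPU_VENDOR[target]:
        steps.append("power off VM")
    steps.append("move VM to " + target)
    # If the target cannot reach the VM's current tier, its disks must move too.
    if vm_tier not in TIER_ACCESS[target]:
        steps.append("relocate disks to storage " + target + " can reach")
    return steps

if __name__ == "__main__":
    print(plan_move(3, "Gold", "Silver"))  # storage is shared
    print(plan_move(1, "Gold", "Silver"))  # Tier1 is not visible to Silver
```

A VM on Tier2-4 only needs the power-off and the move, whereas a Tier1 VM heading to Silver picks up the extra storage step.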