Homer-Simpson.jpg

For the last couple of weeks I’ve been tussling with a problem with my vCloud Director build. It turns out it was ALL my fault after all. So in the interest of being candid, and hopefully helping someone else fix or avoid what I regard as a “schoolboy error”, here’s my strange and terrible saga…

One thing that occasionally happened with my vCD cell is that it appeared to become unresponsive for no particular reason. Symptoms would be browser messages about the system not responding, and the constant reappearance of the “untrusted certificate” warning in Firefox (even though I had added the exception many, many times before).

vcdunresponsive.png

untrustedcertmessage.png

Anyway, I was using my various contacts, and our internal SocialCast, to try to get to the bottom of why this was happening. It was so random. I was restarting clients and the vCD cell itself in a desperate bid to get it back up. I was even paranoid enough to think that somehow the cell had some sort of SSL error, and was regenerating the self-signed certificate or something. Not to be defeated, I took a look at the certificate name and dates. There was nothing untoward there. I guess that thinking was triggered by some early SSO/SSL issues I had between vCD and vCenter, which got resolved with the 5.1.1 builds.

Screen Shot 2012-12-06 at 08.24.43.png

Ah ha. I thought perhaps if I made the vCD name the same as the name on this certificate, that might fix things. Wrong! But this thinking did lead me to the solution. I toddled off to my DNS server to add either a CNAME or an A record for “mycloud”. And that’s where I saw the error of my ways. Two thoughts came into my head. Firstly, what a pigging mess the DNS in my lab is, with stale/old records. Secondly, my HTTP/S IP (192.168.3.134) and my vCD-Console IP (192.168.3.135) had the same name.

Screen Shot 2012-12-06 at 08.38.36.png

What was I thinking when I did this? Two A records with the same name pointing at two different IP addresses are going to result in DNS round-robin, much like how we did load-balancing of web servers in the 90s. Every time the DNS cache was dropped on a client and a new query sent, I would get the next IP address in the bundle. What I should have done was register TWO different names in DNS (and, incidentally, make two different certificate requests). Basically, I didn’t engage the brain. So what I did was delete the vcd01 (192.168.3.134) entry and replace it with a mycloud (192.168.3.134) entry, and then all was right in the world.
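For anyone who wants to sanity-check their own lab DNS for the same mistake, here is a minimal Python sketch. It is only an illustration: the function names `find_round_robin_names` and `resolve_ipv4` are my own inventions for this example, and the record lists are simply the two entries described above typed in by hand, not an export from a real DNS server.

```python
import socket
from collections import defaultdict


def find_round_robin_names(records):
    """Return {name: sorted IPs} for every name with more than one distinct A record.

    `records` is an iterable of (hostname, ip) pairs, e.g. copied by hand
    from a DNS zone. Names are compared case-insensitively, as DNS does.
    """
    ips_by_name = defaultdict(set)
    for name, ip in records:
        ips_by_name[name.lower()].add(ip)
    return {name: sorted(ips) for name, ips in ips_by_name.items() if len(ips) > 1}


def resolve_ipv4(hostname):
    """Ask the local resolver for every IPv4 address it returns for `hostname`.

    Hypothetical helper for checking a live name; it needs working DNS,
    so it is not called in the example below.
    """
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


# The broken state: one name, two addresses (HTTP/S and console proxy).
before = [("vcd01", "192.168.3.134"), ("vcd01", "192.168.3.135")]

# The fixed state: one name per address.
after = [("mycloud", "192.168.3.134"), ("vcd01", "192.168.3.135")]

print(find_round_robin_names(before))  # {'vcd01': ['192.168.3.134', '192.168.3.135']}
print(find_round_robin_names(after))   # {}
```

Any name that shows up in the first result is one a client may resolve to a different address after each cache flush, which matches the “random” behaviour I was seeing.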

There are a couple of teaching points here. Firstly, I’m using the vCD appliance, and I doubted the appliance. I thought that if I removed the appliance and used the installable version of vCD, my problem would go away. Wrong. The problem was of my own making, through my stupid configuration. Secondly, appliances, wonderful as they are, do require post-configuration once they have been deployed (forgetting to configure NTP, Syslog and the correct timezone is another classic mistake), and that configuration is only as good as the operator who does the task.