A lot of my work at 3Sharp involves virtual machines. (And when I say "a lot" I really mean "the majority.") When it comes to running them, I have a choice: I can use Virtual PC or Virtual Server. Non-Microsoft virtualization environments offer some nice features, but I usually can't use them for one reason or another. Given my choices, I tend to stick with Virtual Server; it isn't quite as user-friendly in some respects as Virtual PC, but it doesn't have some of the brain-dead limitations either.
Today, I found a new way in which Virtual Server helps break machines, at least if you operate it in PEBCAK[1] mode. Let me 'splain. I had to deploy a new test lab environment consisting of six virtual machines configured as a single-domain Active Directory forest with a DNS name of contoso.demo and a NetBIOS name of CONTOSO. The only spare machines that we had to run these virtual machines, which are fairly resource-intensive, were a pair of 64-bit servers hanging around after having just been used for a different project. They were still configured as part of their test domain contoso.com with a NetBIOS name of CONTOSO. Now, I might have expected there to be some problems based on the recurring NetBIOS domain name, but when I deployed the VMs everything worked out fine. Since I had the six VMs split over two machines, I threw a crossover cable onto the second network adapters on the hosts, changed the Internal network to bind to those adapters, and away I went.
Unfortunately, I needed to be able to pass data in and out of the test network, so I did what has become my standard practice: shut down one of the machines (usually the DC, since it inevitably ends up doing the least amount of work in my labs) and configure a second network interface, bound to the external network. I made the changes and started the VM back up...and suddenly the host falls off the network. Can't talk to the VS web page, can't ping it, can't RDP to it. I walk into the server room and check the console -- no, it's still running, but I can't log on using the cached domain credentials and the local admin password isn't taking. What's going on here?
To make a long story short, I did more poking around and finally came up with a theory. As far as I can tell, when the DC started up, about the time the Netlogon service started and began registering itself with NetBIOS, it caused the host's network stack to flip out. The host, you see, was a member of a CONTOSO domain already, and to have another one get announced through its physical interface was apparently too much. My fix was to remove the external network interface; instead, I'll just use a two-step copy process to move data in and out (first from the external network to the host, then from the host to the guest).
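If I wanted to catch this before it bit me again, a quick pre-flight check would do it. Here is a minimal sketch in Python, assuming my theory is right and the trouble comes from a guest domain sharing the host's NetBIOS name while being a different domain underneath. The Domain class and the netbios_collisions function are names I made up for illustration; they are not part of Virtual Server or any Microsoft tool, and the domain details have to be typed in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Hypothetical record describing an Active Directory domain."""
    dns_name: str      # e.g. "contoso.demo"
    netbios_name: str  # e.g. "CONTOSO"


def netbios_collisions(host_domain, guest_domains):
    """Return guest domains that reuse the host's NetBIOS name
    but are a different domain according to their DNS name.

    Both comparisons ignore case.
    """
    host_nb = host_domain.netbios_name.upper()
    host_dns = host_domain.dns_name.lower()
    return [
        guest for guest in guest_domains
        if guest.netbios_name.upper() == host_nb
        and guest.dns_name.lower() != host_dns
    ]


# The situation from this post: the host belongs to contoso.com,
# the lab forest inside the VMs is contoso.demo.
host = Domain("contoso.com", "CONTOSO")
lab = [Domain("contoso.demo", "CONTOSO")]

clashes = netbios_collisions(host, lab)
for domain in clashes:
    print(f"Warning: guest domain {domain.dns_name} reuses the host's "
          f"NetBIOS name {domain.netbios_name}")
```

A clash here only means "think twice before binding that guest to the external network." It doesn't prove anything about the underlying cause.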
I don't have time to play with it and figure out exactly what happened: is it a quirk of this network driver, of 64-bit Windows Server, or a generic "don't do this" kind of bug with VMs? Does it happen under VPC too? Much as I'd like to track this down, those questions will have to stay unanswered for now.
[1] Problem Exists Between Chair And Keyboard. Related to the infamous I-D-Ten-T error.