Wednesday afternoon I experimented for the first time with binding Panther Server to our Windows Active Directory. In the Xserve’s Directory Access app, I selected Active Directory, entered the domain, forest, and computer ID, and it all just worked - I could see all of the users and groups on the domain from the Mac’s Workgroup Manager. I was so impressed. Decided to wait for the next day to do more integration testing.
Thursday morning I got an urgent call from the sysadmin that many of the Windows machines on our network were unable to log on because DC “mulder” was not responding, and that according to the System Events logs, the AD services seemed to have crashed exactly when I bound the Mac server to the domain. I didn’t know what to say. I hadn’t intended for the Xserve to become a directory server (and had explicitly not set up that capability), just a client. How could this have happened?
Called Apple, and the rep said he had never heard of this happening before, and seriously doubted that the Mac could have been responsible for the outage. Our sysadmin called Microsoft, who stayed on the phone with him for a long time until they finished reconstructive surgery (first for domain services, and then for replication, which was also busted).
So what happened? The domain controller did not have its own machine name listed in its “Machines” list. When I was filling out the Bind To tab, out of a combination of overzealousness, ignorance, and not RTFMing carefully enough, I entered the name of the machine I was binding to (mulder, the domain controller) into the ComputerID field. I know now that I should have entered the name of the Xserve into that field. When you bind, a key for the binding machine is created in the directory’s “Machines” list. And the Win2K domain controller happily created a machine there called “mulder.” This basically confused the hell out of mulder’s directory services and its ability to replicate with the other domain controller (scully). A case of identity crisis and overwritten Kerberos keys. Everything seemed hunky-dory from the Mac side, but directory services for the rest of the network were effectively fubar for the better part of a day. And the sysadmin justifiably hated me.
Whose fault was this? Mine, for not reading up carefully enough. Win2K’s, for being so willing to take suicide orders. And Apple’s, for not providing a little more guidance in what’s supposed to be entered into the ComputerID field (even pre-filling it with the Xserve’s hostname would have been enough of a clue to make me realize what was expected there).
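For what it's worth, the kind of guard I have in mind is tiny. Here is a rough Python sketch of it. To be clear, this is not anything Apple or Microsoft actually ship: the function names are made up, and the two name lists are hypothetical stand-ins for whatever the bind dialog could look up in the directory. Because mulder was not in its own machine list, the check has to look at the domain controllers' names separately, not just at existing machine entries.

```python
import socket


def default_computer_id(hostname=None):
    """Suggest a ComputerID: the local machine's short hostname, lowercased."""
    name = hostname if hostname is not None else socket.gethostname()
    return name.split(".")[0].lower()


def check_computer_id(candidate, existing_machines, domain_controllers):
    """Return a list of problems with the proposed ComputerID (empty means OK).

    existing_machines and domain_controllers are hypothetical inputs:
    plain lists of names that a real tool would have to fetch itself.
    """
    problems = []
    wanted = candidate.strip().lower()
    if not wanted:
        problems.append("ComputerID is empty")
        return problems

    controllers = {name.strip().lower() for name in domain_controllers}
    machines = {name.strip().lower() for name in existing_machines}

    if wanted in controllers:
        problems.append(f"'{wanted}' is a domain controller, not this machine")
    if wanted in machines:
        problems.append(f"'{wanted}' already has a machine entry in the directory")
    return problems


if __name__ == "__main__":
    # What I typed, versus what the dialog could have suggested.
    print(check_computer_id("mulder", [], ["mulder", "scully"]))
    print(default_computer_id("xserve.example.com"))
```

Either half alone would probably have saved me: a pre-filled field would have told me what was expected, and a refusal to reuse a domain controller's name would have stopped the bind before it touched anything.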
How far should OS vendors go to prevent actions that can kill a network?