The latest news from Skype about their recent major outage points the finger at the automatic Windows Update feature. Well, to be fair, they are actually blaming “a previously unseen software bug within the network resource allocation algorithm which prevented the self-healing function from working quickly.” But the latest post on the Skype blog indicates that the chain reaction was initiated when a large number of Skype clients were rebooted in the same timeframe, due to a routine Windows Update.
While it does seem plausible that a massive concurrent restart of Skype clients could cause some grief for Skype’s network, that doesn’t explain why it took 2 days to restore service. And I’m also left wondering why previous Windows Updates haven’t caused similar problems. What do you think: is there more to this story?


Bruce,
There is a clear explanation as to why the massive restart caused grief for Skype's network. They said it was a bug in their software. It is hard to blame MS when the reason is actually a bug in your own software. Taking three days to find, fix and test a bug in a complex piece of communication software, and then roll the update out across all of Skype's many servers, is pretty much a success.
I dunno, Bruce. I'm having a hard time buying Skype's explanation. I guess it is possible, but somehow it doesn't seem probable. If it was a combo of a Skype bug and Patch Tuesday, wouldn't we have seen something similar in some other comparable service, like the various IM networks/clients that would restart on reboot (or near reboot)? And many of these IM clients also have VoIP/VON capabilities (Yahoo! Messenger, MSN Messenger, etc.). They can't all have perfect bug-free code. :-)