Web operations enthusiast, capacity planner, author
November 19 2009
Last week I gave two months’ notice – I’ll be leaving Flickr in January. When Stew and Cat asked me to join Flickr in January of 2005, I felt like it was time to go and do something different, so I said yes. Five years (and four billion photos) later, it’s again… read more
How Complex Systems Fail: A WebOps Perspective
November 12 2009
I guess I’m late on getting to this, but How Complex Systems Fail by Richard Cook is excellent. Let me start with this: I don’t think I can overstate how right-on this paper is, with respect to the challenges, solutions, observations, and concerns involved with operating a medium to large web… read more
When you deploy: your internal monologue
October 07 2009
The minimum cycle of questions you should be asking yourself. As brought up by @debuggist and @benjaminblack. read more
October 05 2009
Like all sane web organizations, we gather metrics about our infrastructure and applications. As many metrics as we can, as often as we can. These metrics, given the right context, help us figure out all sorts of things about our application, infrastructure, processes, and business. Things such as… What: …did we do… read more
WebOps: Good prep for becoming a new parent?
September 30 2009
I think I’ve said before somewhere that working in the field of web operations prepared me somewhat for being a parent. I thought the other day that I should write down some of this reasoning, because it’s pretty often that I’m reminded of similarities: High availability Having redundant infrastructure is WebOps 101.… read more
Automated Control paper by the RAD Lab folks
August 01 2009
Wow, how did I miss this until now? In June, some smart people gathered in Barcelona for the First Workshop on Automated Control for Datacenters and Clouds (ACDC09) and jeez it looked like it was a good time, from a glance at the program. One of the cooler papers is “Automatic exploration… read more
Extreme Automated Infrastructure
July 18 2009
I’ve said it before that I’ve always been a huge fan of SystemImager, for super simple imaging. It has some shortcomings for config management, but those are solved with things like Chef or Puppet. With all of the great things being talked about surrounding ‘Automated Infrastructure’, I’ll point to something insanely… read more
July 16 2009
Excellent. Good work, Ben: ah, the mighty service level agreement! the tooth and claw by which the wily customer brings the vendor to heel. get the SLA right and you, the customer, can sit back and relax, safe in the knowledge that should there be an outage, you are covered. your… read more
Uncaching bits in filesystem cache
July 09 2009
Domas makes something more useful than I bet most would think: http://mituzas.lt/2009/06/26/uncache/ read more
June 23 2009
That was a blast! I had never done a ‘duet’ talk before. Here are the slides: 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr read more
May 22 2009
I can’t tell you how ripped I get when people say things like this: “cloud computing means getting rid of ops” If by “ops” you mean “people in data centers racking servers, installing OSes, running cables, replacing broken hardware, etc.” then sure, cloud computing aims to relieve you of those burdens. If… read more
Context and Operational Metrics
May 11 2009
I really don’t think it can be overestimated how important context can be when it comes to troubleshooting or evaluating the health of an infrastructure. When starting to troubleshoot a complex problem, web ops 101 “best practices” usually start with asking at least these questions: When did this problem start? What… read more
Mechanical Analogies To Web Stuff, Part 2.
May 06 2009
This is a ramble continued from before, which means it’s mostly a blog post for me, but maybe others might find it interesting. The last time I made an analogy between back-end web architectures and mechanical structures, I blathered on about what are basically structural limitations of individual components in… read more
Slides from Web2.0 Expo 2009. (and somethin else interestin’)
April 03 2009
That was a pretty good time. Saw lots of good and wicked smaht people, and I got a lot of great questions after my talk. The slides are up on slideshare, and here are the PDF slides. There was something that I left out of my slides, mostly because I didn’t… read more
Why I didn’t include queueing math in my book.
March 25 2009
Some people have wondered why I chose not to include any real amount of material in my book about the mathematical topics related to capacity planning, like queueing theory. There are already many other excellent books that dig into the math behind Little’s Law, M/M/1 queues, and Poisson arrival processes. These… read more