Zed Shaw is an up-and-coming programmer in the Ruby world. He’s the creator of the popular Mongrel web server, and is building a reputation for fast, solid, secure code. In this interview, he discusses Mongrel, Ruby, and his path to better code.
How did you come to Ruby?
Zed I actually came to Ruby years back while developing the first version of a weird revision control tool I was playing with called FastCST. I tried Ruby out but didn’t quite see the point and so went back to using C. Then I read Curt Hibbs’ article and realized, “Hey, they do domain specific languages with that thing!” Right after that I started a Ruby version of FastCST and then became distracted with work and several other weird projects related to Ruby on Rails.
Ah, there’s the Rails link that most people expect when someone says they use Ruby. How much of your work is in Ruby vs Ruby on Rails?
Zed I’d say the majority of my work with Mongrel is in Ruby in order to support Rails and the other frameworks that run with Mongrel, but I work with Rails at my day job.
Zed What I like about Ruby is how you can express statements succinctly but still clearly so that other people can read your code. It has warts but the speed that I code in Ruby is incredible. The language is just amazing for how it mixes domain specific language abilities with object oriented design to let me crank out fully functioning applications at prototyping speeds with production quality.
A lot of people are saying that Ruby and Rails aren’t ready for the enterprise. What’s your take on this?
Zed Before answering this I’ll have to clarify the term “enterprise” into something people can talk about. Right now “enterprise” seems to mean three general things:
- “Big and expensive for running real businesses.”
- “Scales and performs well enough to meet my service demands.”
- “Has legally enforceable commercial support options to cover potential losses.”
Okay, let’s talk about each of those concepts. Ruby and Rails are both free. How does this square with spending a lot of money?
Zed The definition that “enterprise” means you always have to spend millions on hardware and software to run your business is just wrong. For years companies have been pushing this idea because they stand to benefit if you buy more of their products. The reality is that your solution needs to be tailored to the problem at hand and simply saying that you always need a giant solution means you aren’t really evaluating your needs. Rails demonstrates that this kind of “enterprise” has set bogus expectations for architectures, features, and such that aren’t needed and have had crappy ROI.
What Rails seems to be doing is proving that you can run large operations without spending tons of money on “enterprise” solutions. Yes Rails can and does run real businesses. It actually is making some businesses lots of money and is being used by many entrepreneurs to kick-start their ideas with little capital investment. I know of several small shops in New York that got their first sites up and running with only a few developers and started making money with only a minor investment. The fact that companies can reduce their initial risk of investment like this should be reason enough to use Rails to power “real businesses”.
Alright, what about the scalability issue?
Zed The folks who mean “scalable” when they say “enterprise” do have valid claims, many of which I’m trying to address with Mongrel. First off, there needs to be a redefinition of the term “scalable” away from “high performance” and back to “resource expandable”. Once you start to talk about performance and scalability separately you can give a more concrete answer to both concerns.
Rails scales (meaning expands to meet needs) just like any other web application framework technology. Mongrel makes this even easier since it is fast and HTTP based. If you were using Tomcat, Resin, WebLogic, or Apache+PHP before, then Mongrel running Rails pretty much just drops into the existing infrastructure.
I’ll be honest right away though and say that Ruby is slow. The Ruby community has been ignoring the huge “performance” elephant standing in the room and they need to start talking about it so it goes away. Elephants hate being talked about. There are a few efforts to make Ruby faster, but I see a lot less action than is needed to solve the problem. One solution in the works is a real virtual machine called Rite (or YARV depending on who you talk to) which is showing some real promise and seems to be speed competitive with the fastest Java implementations.
Ruby’s advantage though is not in its blazing execution speed but its blazing coding speed. I’ll put it to you this way: I wrote Mongrel in about 3 months. That’s a full featured stable web server that can run four Ruby web application frameworks and is already powering many Ruby web sites. This wouldn’t be possible without Ruby the language.
Rails has a different situation from Ruby’s. Rails has a wonderful caching system, made up of “page caching” and “fragment caching”, that compensates for Ruby’s slow execution speed. Rails uses this to transfer the actual web traffic from Rails to the web server itself. This means that with careful planning of how you’ll cache parts of your Rails application you can get the same performance as your static file serving web server. Because of this a Rails application can often outperform similar applications in Java or PHP.
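The idea behind page caching can be sketched in plain Ruby without Rails. This is only an illustration under stated assumptions: `PageCache`, `fetch`, and `expire` are hypothetical names, not the Rails API. The point is that the expensive render happens once and the result lands as a static file, which a front-end web server could then serve without touching Ruby at all.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical illustration of the page-caching idea; this is not Rails' API.
class PageCache
  def initialize(public_root)
    @public_root = public_root
  end

  # Returns the cached page if present; otherwise runs the block (the slow,
  # dynamic render), stores the result as a static file, and returns it.
  def fetch(path)
    file = file_for(path)
    return File.read(file) if File.exist?(file)

    html = yield
    FileUtils.mkdir_p(File.dirname(file))
    File.write(file, html)
    html
  end

  # Removes a cached page so the next request is rendered again.
  def expire(path)
    FileUtils.rm_f(file_for(path))
  end

  private

  def file_for(path)
    File.join(@public_root, "#{path}.html")
  end
end

renders = 0
cache = PageCache.new(Dir.mktmpdir)
# Two requests for the same page: only the first one runs the render block.
2.times { cache.fetch("/posts/1") { renders += 1; "<h1>Post 1</h1>" } }
```

The "careful planning" Zed mentions corresponds to deciding when to call something like `expire`, since a stale file is served just as quickly as a fresh one.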
And, what about that commercial support question?
Zed It’s a total no-deal for Rails right now. I agree with these folks that many organizations need a commercial support option and SLA before they invest in a technology. I really think the first company to do a serious good job at Rails commercial support will make a mint, but until then these organizations are just out of luck. If you look at PHP it really wasn’t until Zend started offering commercial support that big companies considered PHP a serious platform. Nothing really changed about PHP, it was just the perception that there was a company you could take to court now, so it was safer to use.
Before starting on Mongrel, you were working on SCGI. Can you explain a bit about their respective places in the web framework and how you see them playing together (or not).
Zed SCGI was my first attempt at doing a simple alternative to FastCGI. The main goal for SCGI was fast Rails hosting with only pure Ruby. This worked pretty well, but the reality is that SCGI has limited support in most web servers and doesn’t seem to be on the radar for future development. Lighttpd’s support was originally a bolt on modification of its mod_fastcgi. Apache’s module comes from outside the Apache project and the Apache project just announced heavier support for FastCGI rather than SCGI. Throw in the fact that many people have problems getting SCGI in Apache to talk to multiple backends and it’s not looking good for SCGI.
Mongrel originally started as an SCGI proxy designed to solve this problem. I wrote the HTTP parser and then started working on a C only proxy that’d answer HTTP requests and translate them to SCGI. About half way through I realized that the parser I wrote was good enough to just skip the middle-man and write a Web server directly in Ruby. About a day later Mongrel was born.
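To give a feel for what translating HTTP to SCGI involves, here is a minimal sketch of SCGI request framing as laid out in the public SCGI specification: headers are NUL-terminated name/value pairs wrapped in a netstring, with `CONTENT_LENGTH` first and an `SCGI` marker, followed by the body. The helper name `build_scgi_request` is hypothetical and this is not code from Mongrel or Zed's SCGI package.

```ruby
# Sketch of SCGI request framing as described in the public SCGI spec.
# build_scgi_request is a hypothetical helper, not Mongrel or SCGI library code.
def build_scgi_request(headers, body = "")
  # CONTENT_LENGTH must come first, followed by the SCGI version marker.
  pairs = [["CONTENT_LENGTH", body.bytesize.to_s], ["SCGI", "1"]] + headers.to_a
  # Each name and each value is terminated by a NUL byte.
  payload = pairs.map { |name, value| "#{name}\0#{value}\0" }.join
  # The header block is sent as a netstring ("<length>:<data>,"), then the body.
  "#{payload.bytesize}:#{payload},#{body}"
end

request = build_scgi_request(
  { "REQUEST_METHOD" => "POST", "REQUEST_URI" => "/deepthought" },
  "What is the answer to life?"
)
```

A proxy of the kind described would fill the header hash from a parsed HTTP request and write the resulting string to the backend socket.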
My plans for SCGI right now are to simplify it down to the absolute minimum necessary to run the protocol. SCGI currently has lots of DRb management code and stuff that some folks use (and abuse) but in general doesn’t help people who want to use SCGI. In order to keep the current crop of SCGI users supported I’ll “back port” some innovations from Mongrel–such as the thread model–and then simplify the whole package to more what SCGI was like in the earlier days.
Given that, I would like to stress that my future work will be with Mongrel and that I really think it is a much more capable way to support Ruby web applications. HTTP is just an easier protocol to deal with in terms of support and deployment tools.
If anyone is interested in taking over support for SCGI–maybe one of those companies that took it and started their next big product on it–it has a RubyForge project and I’d be ready to hand over the keys to anyone who’s interested and capable.
How does your work on Mongrel affect your Rails work, and vice versa?
Zed When working on Mongrel itself the Rails code I have to use is very minimal. This is done on purpose to keep Mongrel loosely coupled from Rails in case they change something in a release and break everything.
When I’m working with Rails, I use Mongrel actively and take notes for later improvement. This helps keep Mongrel real and keeps me from going into space inventing things for purely academic reasons. For example, I use Windows at work for development and it’s incredibly painful. Not so much because of Windows but because I seriously think nobody developing Rails actually even gets within five feet of a Windows computer. A lot of the enhancements that Luis Lavena made were touched up and improved simply because I wanted to make using Ruby on Rails easier for me and other poor slobs forced to use Windows.
I know that you’ve worked with the Rails and Nitro camps (at least) to make Mongrel work with them. What have been the biggest obstacles?
Zed The best thing about making Mongrel “framework agnostic” is that other people take it and do unexpected things with it. I’m really the only person working to make Mongrel coexist with Rails. Some folks on the core Rails team mostly test it and use it when they do development, but Luis Lavena, a few others, and I ended up doing all the work to make it production ready. Since Luis and I are the only ones who need it to work with Rails in production that’s to be expected.
The Nitro, Camping, and IOWA teams however mostly did the work for me on their frameworks. They took Mongrel, read the documentation, bugged me for initial help, but for the most part it’s been hands off. I think I helped Camping the most, but why (the lucky stiff) actually manages the Mongrel code related to Camping. He’s also just contributed back a nice changeset implementing solid large file uploads/downloads which I’ll put in the 0.3.12.5 release. Why says he’s doing DVD uploads/downloads off his ParkPlace project.
What does Mongrel give back to the projects that use it?
Zed The two biggest things that the projects all should start getting from Mongrel are security enhancements and win32 support.
Mongrel’s design is based around the idea that most of the security problems in HTTP servers come from hand-coded parsers that are too “loose” with the protocol. Mongrel uses a generated parser (using Ragel) that is very strict and seems to block a huge number of attack attempts simply because it is so exacting. Since this protection comes at the HTTP level, any framework using Mongrel gets it for free.
In the EastMedia/VeriSign project we were seeing a bunch of attack attempts from a “security company”. I won’t name the company since I don’t want to give them any extra press, but they were running some kind of security scanning software (without asking first) against machines of ours that we hadn’t announced yet.
The beautiful part is that Mongrel blocked all of the attacks immediately at the HTTP protocol level and kicked them out without wasting any time. Meanwhile, Apache let the traffic right on through the proxy without even a warning.
After they ran the automated scans we saw a few “hand coded” attacks which probably means someone at this “security company” was very intrigued by what Mongrel was doing.
The funniest part of this is that all Mongrel does is use a correctly coded parser based on a real grammar and a parser generator (Ragel). Other web servers use hand coded HTTP parsers that turn out to be vulnerable, difficult to compare to the real HTTP 1.1 RFC grammar, and are just a pain to manage. Using Ragel makes Mongrel robust against many of these attacks without actually having to create specific logic for detecting “attacks”.
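The "strict by construction" principle can be illustrated with a small sketch. This is not Mongrel's parser, which is generated by Ragel from a grammar. It is a hand-written regular expression over a heavily simplified request line, and `BadRequest`, `REQUEST_LINE`, and `parse_request_line` are hypothetical names. What it shares with the approach described above is that nothing looks for "attacks": input either matches the grammar exactly or is rejected.

```ruby
# Hypothetical, heavily simplified stand-in for a grammar-driven parser.
class BadRequest < StandardError; end

# Method, single space, a URI of printable non-space ASCII with a length cap,
# single space, HTTP version, and exactly one CRLF. Nothing else is accepted.
REQUEST_LINE = %r{\A(GET|HEAD|POST|PUT|DELETE|OPTIONS|TRACE) (/[!-~]{0,1023}) HTTP/1\.[01]\r\n\z}

def parse_request_line(line)
  match = REQUEST_LINE.match(line)
  # No attempt to repair or guess at malformed input: it is simply rejected.
  raise BadRequest, "malformed request line" unless match

  { method: match[1], uri: match[2] }
end

parsed = parse_request_line("GET /index.html HTTP/1.1\r\n")
```

A "loose" parser would typically split on whitespace and accept whatever falls out, which is where odd inputs start reaching application code.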
The second benefit other projects are getting from Mongrel is win32 support from Luis Lavena. After Mongrel’s success on the win32 platform I started seeing messages saying that Luis was helping other projects get solid win32 capabilities. The rumors suggest that Luis and friends might actually open up a whole Ruby world for win32 users. I’m hoping that this brings some help to Daniel Berger’s win32utils project as well.
What I’d like to see on the win32 front is the Ruby One-Click Installer pick up all the win32 support natively and include it by default. Better yet, I’d like to see Ruby do what Python does and include the win32utils stuff as a platform specific add-on or a gem folks can download.
One of the big drivers behind Mongrel is that it’s fast and mostly native Ruby. What’s your process for optimizing? What tools are you using?
Zed My main tools when trying to optimize (and also validate) C code are valgrind and kcachegrind. Both are fantastic for free tools, but sadly Ruby does not run well under valgrind. In fact, valgrind dies on even “hello world” with 30k errors before the program has started. What I did initially with the HTTP parser was I wrote a little harness that let me run the parser under valgrind and then tuned it with kcachegrind.
The rest of my performance analysis comes from setting up a series of test applications that I then hit with httperf to measure their speed. I keep a log while I’m working on it and make sure that the performance doesn’t drop. If I make a change with an expected performance boost and it doesn’t do anything then I evaluate it again and try something else that might work.
The whole process is really just the scientific method. Since I have limited information from Ruby about performance I have to just test, evaluate, adjust, and repeat until the measurements improve. What really helps is using statistical tests to confirm that each change made a difference, or at least didn’t hurt things. Without these tests I could make changes that seemed to improve things but actually made no difference.
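As a rough sketch of what confirming a change statistically can look like, the snippet below computes Welch's t statistic for two sets of throughput measurements. The interview does not say which tests Zed ran, so the choice of Welch's test, the helper names, and the sample numbers are all assumptions made for illustration.

```ruby
# Hypothetical helpers; the interview does not say which tests were used.
def mean(xs)
  xs.sum.to_f / xs.size
end

def sample_variance(xs)
  m = mean(xs)
  xs.sum { |x| (x - m)**2 } / (xs.size - 1)
end

# Welch's t statistic for two independent samples with unequal variances.
def welch_t(a, b)
  (mean(a) - mean(b)) / Math.sqrt(sample_variance(a) / a.size + sample_variance(b) / b.size)
end

# Made-up requests/second figures from repeated runs before and after a change.
before = [102.0, 98.0, 101.0, 99.0, 100.0]
after  = [110.0, 108.0, 111.0, 109.0, 112.0]
t = welch_t(after, before)
```

A statistic close to zero means the two sets of runs are hard to tell apart, while one far from zero suggests the change did something. A full test compares the value against the t distribution, which a tool such as R's `t.test` handles.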
I also use Ruby’s profiling library, but I can only do that in very limited tests where only Mongrel is running. When Mongrel runs the other application frameworks the framework code drowns out any Mongrel related performance data and doesn’t give me any decent information.
A good example of this is in a simple test I have that returns HTTP request parameters as a YAML dump. I can’t use this test for profiling because the YAML library is such a pig that all of the profiling information is about YAML. Mongrel is just a little blip. Rails or Camping does the same thing so profiling turns out to be more about them than about Mongrel.
When I get really serious about performance I use R and run planned measured tests with statistical evaluations. This involves more planning than most people are familiar with (as I’ve ranted about before) and I usually only do it if money is on the line since it takes a large amount of time to get right.
Recently, you’ve been putting a lot of work (beyond unit tests) into making Mongrel stable and secure. Would you explain your methodology and the tools you’re using to make it work?
Zed I agree with the OpenBSD group’s assertion that security holes come from defects in general, not from some specific “security hole” that you look for in the source code. This means that I think if I fix all the defects I can find, and try to be proactive about potential errors then I’ll prevent a lot of security holes in the process.
With any of my projects I try desperately to do the following:
- Keep the code as incredibly simple as possible. I call this “The Shibumi School of Software Structure” because I like the letter ‘S’ and because it’s the exact inverse of what most programmers do when they structure software.
- Code reviews of my own code before releasing, constantly trying to find:
  - “missed assertions”: Unstated assumptions about inputs and outputs.
  - “missed else”: Logical branches that don’t cover all test domains.
  - “will it stop”: Looping errors that will cause classic infinite loops or short loops.
  - “check that return”: Return values that aren’t dealt with properly (which are really assumptions about other inputs and outputs).
  - “unexpected exceptions”: Exceptions are pretty darn evil since they’re rarely documented.
  - “simply readable”: Replacing clever code with readable simple code where possible, and documenting complex code so it can be reviewed by others.
- Unit testing as much as possible. When writing networking software unit tests become really difficult since you can only actually test it over a network of some kind.
- External thrashing and performance tests trying to break the system with unexpected inputs. Techniques I use are fuzzing, heavy loads, stopping interactions violently mid-stream, ripping out resources at random, and trying to think about ways someone could attack the system.
- Usability reviews from potential or current users. My motto here is “If I KMFU (Know My F*ing Users) they won’t have to RTFM.” I really think if a system is easy to use then the security concerns are lower, but I don’t have much evidence to support this claim.
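The external thrashing item in the list above can be sketched as a tiny mutation fuzzer. This is a hypothetical harness, not Peach Fuzzer or anything from Mongrel's test suite: `fuzz`, `RejectedInput`, and `parse_header` are names invented here. It corrupts a few bytes of a known-good input and records every case where the target fails with something other than its planned rejection error, which also exercises the “unexpected exceptions” review item.

```ruby
# Hypothetical mutation fuzzer: overwrites random bytes in a valid input and
# collects any input that makes the target fail in an unplanned way.
def fuzz(seed_input, runs:, expected_error:, rng: Random.new(42))
  crashes = []
  runs.times do
    input = seed_input.b # binary copy, so arbitrary byte values are valid
    rng.rand(1..3).times do
      input.setbyte(rng.rand(input.bytesize), rng.rand(256))
    end
    begin
      yield input
    rescue expected_error
      # Clean rejection: this is the behaviour we want from the target.
    rescue StandardError => e
      crashes << [input, e]
    end
  end
  crashes
end

class RejectedInput < StandardError; end

# Toy target: parses a single "Name: value\r\n" header line strictly.
def parse_header(line)
  raise RejectedInput unless line.match?(/\A[A-Za-z-]+: [ -~]*\r\n\z/)

  line.chomp.split(": ", 2)
end

crashes = fuzz("Host: example.com\r\n", runs: 1_000, expected_error: RejectedInput) do |input|
  parse_header(input)
end
```

An empty `crashes` list only says that this seed and mutation strategy found nothing. Real fuzzing runs use many seeds and far more varied mutations.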
Since I’m just one guy doing all this — and since this is supposed to be fun for me — I don’t follow all of these as religiously as I would in a professional setting. When Mongrel got some real funding I took the above steps more seriously. For about 3 weeks I was increasing the number of unit tests, slowing down releases so I could do code reviews, and I grabbed Peach Fuzzer to get some simple thrashing and fuzzing tests going.
The end result was that Mongrel ended up being able to stop a large number of attacks directly at the protocol level. This doesn’t mean Mongrel is impenetrable, but I think it’s on the road to being one of the most secure web servers out there. Of course now every hax0r will try to break Mongrel but I predict any future attacks will exploit flaws in Ruby or in the application frameworks rather than in Mongrel. Not much I can do about that right now though.
What do you think Mongrel needs to take it to the next level?
Zed A big component of my Mongrel work in the near future will be simply improving the deployment documentation people use. So far there’s just a document on setting up lighttpd, but what we really need is some solid documentation on deploying production Mongrel clusters on various platforms. Once this kind of documentation is available people should start to get more comfortable with deploying Mongrel, especially if they are in an environment that already runs other application servers like Tomcat.
In fact, I tried to meet as many people as possible at Canada on Rails to convince them to try their applications on Mongrel and to sort out what kinds of deployment scenarios people are facing with their applications.
How did that go? How will you use that feedback in Mongrel?
Zed The specific Mongrel feedback I received was very positive, and the majority of it was not to me directly but from people recommending it to others. Actually some of it was pretty embarrassingly glowing which is fantastic since it means I’m on the right track. I am a little worried that people just haven’t run into the big problems yet, but I’m guessing that Mongrel is really hitting the right spots.
The serious questions were the most valuable though. Many people asked questions about deployment that I’m hoping to address in a series of nice documents covering various deployment scenarios. Others asked about cluster management which I’m hoping Bradley Taylor from RailsMachine.com will solve with his upcoming cluster plugins for Mongrel. A few others asked about licenses which I’ll address in some FAQs.
The real tough questions seemed to be about how best to handle caching and distribute load for complex dynamic web sites. I really didn’t have an answer for these folks but I took their complaints and started formulating the base idea for my next project. I’m hoping this next effort will be another solution to the supposedly solved “caching problem”. The feedback I received from people about my ideas around caching was very enthusiastic so I think I’m on the right track there.
What’s holding Mongrel back?
Zed I’d say the biggest obstacle has been getting Mongrel accepted as a production platform. Developing Mongrel has been great. I get love letters nearly every day from the community saying how much they think Mongrel rocks. The only missing piece is a few good huge production deployments using Mongrel. I’m thinking that these will start happening in the next few months as people start to deploy the applications they’ve been developing.
What are your 5 favorite libraries/frameworks for Ruby (whether in the standard library, or off the ‘Net)?
Zed I really dig why the lucky stiff’s Camping framework. It’s amazing how much voodoo why put into that thing in such a small space. Mongrel has more than a few bits of code or ideas borrowed from it.
I also use webgen quite extensively to manage the Mongrel website. It’s a great way to generate static sites from a small set of pages written in wiki format.
I also really like this tiny fast little Java webserver called Simple. When I started Mongrel I studied Simple and adopted its Handler setup. Simple’s got some other odd features–like parsing responses to correct them–but still remains remarkably small and fast. If I were ever going to do a Rails competitor in Java, Simple would be at the center.
The only web performance measurement tool I’ll advocate to people these days is httperf. It’s the only one that gives accurate statistics, breaks the entire request/response chain down, accurately reports socket errors, has exact definitions of what each measurement means and doesn’t claim to measure “users”.
I also really like Lua as a light alternative to Ruby. It’s fast, real tiny, embeds into other programs well, and has a syntax that’s close enough to Ruby to not seem entirely foreign. I’ve been looking to use Lua in a couple C only projects I have planned as the extension language.
What’s next for Ruby, Rails, Mongrel, and Zed?
Zed You’ll have to ask Matz about Ruby and David about Rails. What I can say is what I’d like to be next for Ruby and Rails.
For Ruby I’d like to see two efforts. First that it always runs clean under valgrind. This would go a long way to improving its stability and keeping it clean. The second is for all these people working on different “make Ruby faster” projects to pour their collective talents into making the Ruby 1.9 virtual machine fast and perfect.
For Rails I’d like to see a lot of the fat go away and for ActiveRecord to finally get a decent connection pooling system. By “fat” I mean stuff that I believe DHH is already planning on moving out into plugins like Active Web Service. For ActiveRecord there needs to be a solid effort to refactor it so that database connections are pooled in much the same way that Hibernate does. This is especially important for people using commercial databases that license based on the connection count.
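For readers unfamiliar with the term, connection pooling can be sketched in a few lines of Ruby. This is a hypothetical minimal pool, not ActiveRecord's or Hibernate's implementation, and the strings stand in for real database connections. A fixed number of connections is created up front and shared, so the database never sees more than that many, no matter how many threads are doing work.

```ruby
# Hypothetical minimal pool; not ActiveRecord's or Hibernate's implementation.
class ConnectionPool
  def initialize(size, &factory)
    @available = Queue.new
    size.times { @available << factory.call }
  end

  # Checks a connection out, yields it, and always returns it to the pool.
  # Callers block while every connection is in use.
  def with_connection
    connection = @available.pop
    yield connection
  ensure
    @available << connection if connection
  end
end

created = 0
# Strings stand in for real database connections in this sketch.
pool = ConnectionPool.new(2) { created += 1; "connection-#{created}" }

# Eight worker threads share the two pooled connections.
threads = 8.times.map do
  Thread.new { pool.with_connection { |conn| sleep 0.01; conn } }
end
used = threads.map(&:value).uniq.sort
```

With per-connection licensing, the pool size becomes the number that matters for cost, which is the concern raised above for commercial databases.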
Mongrel’s future is looking pretty bright (so bright I gotta wear shades). I’m making the push toward the first official production release, dubbed “Mongrel 0.4 Enterprise Edition 1.2” since tacking “Enterprise Edition” on everything worked so well for Java. I’m also working with more companies to either provide services around Mongrel or to include Mongrel in potential products.
My next big project will be a special caching proxy server that I’m aiming at making any dynamic web applications much faster. While building and using Mongrel I’ve found that the whole caching situation with HTTP 1.1 uses very 1996 technology. I think I’ve got an idea that could solve the problem and potentially give many of these web applications huge performance and scalability boosts.
Zed A. Shaw is a professional software developer who’s been writing software for close to 13 years in industries ranging from government and academia to commercial software, and on applications ranging from security products to network protocols and web applications. He’s also dabbled in system administration, product development, usability engineering, and customer service. In his spare time he likes to write biographies so people think he’s super cool.
Pat Eyler is an Infrastructure Engineer for the LDS Church by profession, a Ruby geek by choice, and a writer by night. He enjoys reading, cooking, spending time with his family, and helping to build the Ruby community.