This 30-day project explores the refactoring of a legacy system. The Everything Engine is an aging software project that powers Perl Monks, Everything 2, and a few other websites. It suffers from poor design and maintainability.
Here’s what I learned about programming, refactoring, and 30-day projects during this series.
That’s it; this is the end of my first published thirty-day project. I spent around thirty hours working on the Everything Engine, trying to refactor it into something a little nicer and more usable. I had some successes and some failures, and I learned a lot. I hope you did too.
Why a thirty-day project? First, because it’s a lot bigger than the usual “Hack for a couple of hours, then write a weblog about it.” I know 30 weblog entries over a period of about six weeks is a lot to ask people to pay attention to, but I tried to throw in other publishing as well. (It’s a lot longer than my attention span, at least.)
I think it’s also nice to watch a project evolve over time. I tried to be very clear about my missteps and changes of direction. I like to think you’d have seen the same thing if you’d been pairing with me. That may or may not be valuable, but I do think it’s a noble goal.
I didn’t have an ultimate goal when I started the project, except to leave it in a cleaner state. On the whole, I think I did. I didn’t achieve as much as I wanted to achieve, but thirty hours of solid work represents about as much development work as I would do in an actual job during a week.
Thus if you look at the whole project as a single developer taking a week off to clean up the system (and if I were to do that in an ongoing, paying project, I would have more concrete goals), it looks pretty good.
The most important part of the retrospective is the technical part. That was the goal, not just to clean up a system but to document how I did it.
The system is, on the whole, cleaner than it was when I started. Even the first day made a big improvement; fixing the directory layout and making it easier to manage files in Subversion allowed me to spend less time thinking about how the project worked and more time making other changes.
I spent the most time refactoring the tests for all of the node classes. Part of this enabled me to improve test coverage dramatically, but most of it was so that I could perform further refactorings.
This is didactically valuable in that it shows how refactorings build on each other. Granted, the previous system design was far from a good design, so a project built with more discipline from the start would have needed less invasive changes.
Making that work required me to fix the design in some other parts of the system as well, notably improving the node method dispatch. This did remove some of the dynamism from the system (so it’s technically a behavior change and not a refactoring), but as a temporary change, I feel comfortable making it. It would have been very difficult to support the old behavior alongside the new.
I did get to make a few other code changes, though I didn’t take them as far as I wanted to. The database access scheme still needs some attention, but it’s getting much cleaner than it was. Every time I made a change to the system, I noticed how much extra coupling there was between components. There’s a lot of work to go, but every change makes things a little cleaner and other changes possible.
Toward the end of the series, when I started to concentrate on the database code, I realized that my ultimate goal had to be running the node tests against a live database. The mocking code was clever (and there was a lot of it), but I’d never get any more confidence in the system without much more integration.
I don’t have much of that yet, but I do have a start, so I know it’s possible and even doable.
My biggest success overall was discovering a new technique that combines
find it so useful that I do it by default now. (It also helped me fix some
SUPER.) I’ve since written about this in Mocks in Your Test Fixtures, on Perl.com.
Not everything went perfectly though. I’m pretty sure the system as a whole has some subtle bugs that weren’t there before I started making changes. (I know it had some even before that.) The test coverage still isn’t good enough. It’s better and it’s fixable, but I’m more aware than ever of how incomplete it is.
I spent a lot of time translating node tests to the new style by hand when I could have written a small program to do it for me. If I’d done this all in a week, I’d have automated the process. By spreading it out, I felt less pain at repeating myself.
I also feel like I made a bunch of little messes. I remember promising to clean up a few things later. Of course, that’s how refactoring goes. I turned a big tangle of code into several smaller tangles. Hopefully they’re more manageable, but I still feel a little bit guilty about not making perfectly clean code in only thirty hours. It’ll pass.
Mostly, I learned that refactoring a large system is more difficult than it seems — and that the ease of refactoring depends on the quality of the test suite.
We had a big test suite and it covered a lot of the system, but it didn’t cover it particularly well. I learned a lot from writing it (and have learned even more since then).
It’s also funny/ironic to see how some of the newer web application systems do what they do. Everything did a lot of that several years ago, but it never had quite the polish of the other systems. It would be nice to bring it up to date… but it needs some infrastructure work first.
This was my first thirty-day project, so it’s worth thinking about how that worked too. I’d like to do another, but I’m not sure what it might be yet.
I really like the format and the discipline of choosing a workweek-sized task and documenting that honestly. It’s a nice size for a project and it’s not an investment too heavy for me or for readers. I hope.
I didn’t receive many public comments, but I did receive a fair amount of private feedback that people found the series valuable. For a subject as esoteric as the software behind a handful of (admittedly useful) sites, as well as class-based testing with Perl, it was nice to know that other people followed along. I hope the series is generally useful.
I also learned a lot about how to set up this series. I tried to work a week in advance, knowing that I’d publish three times a week. For the most part, that worked. I fell behind toward the end due to a vacation, but I don’t really regret that. (I wonder how it affected the readers, though.)
Our weblog instance here makes it difficult to categorize these posts. I hope we can rectify that soon.
I was never sure whether to post patches (especially 1000+ line monsters) or refer people to the Subversion repository with the appropriate checkin numbers. I almost always did the latter, but removing the code from the weblog, even by one click, likely reduced the number of people who read the code.
The series bogged down in the middle when I spent two weeks of entries refactoring the test cases from the procedural form into the Test::Class form. I know that was as boring to read as it was to write. I promised that I’d be honest about the process, though. If I do this again, I’ll try to pick something with less tedium in the middle.
The three-a-week publishing schedule seemed to work. It gave me a chance to fit the hour of work per virtual day into my normal schedule. Working a week or two in advance helps me work around deadlines, so I’ll do that next time.
Starting a new project (or at least a new feature of an existing project) seems more interesting than refining an old project. Yet I’m not sure how pure research will go. For example, if I write a game in Ruby, will I run into trouble when I find out I don’t know how to use Rake effectively? I don’t know.
I don’t know how to get more people to read the code as I make it, and I don’t know how to handle potential contributions during the process. That may never come up, though.
I do think that getting the right project (and sticking to the schedule) will be more interesting to readers; maybe measuring that by feedback isn’t the right approach. However, I do still like this idea of learning in public. It would be nice to see other people do it.
I’m not sure what my next project will be, but I hope to decide soon. Thanks for reading along.
All Thirty Days
Based on reader feedback, here’s a list of links to all 30 entries.